
Refactor to (a) get rid of cursors, (b) have a nicer interface #2

Open
smurfix opened this issue Jan 17, 2019 · 11 comments

smurfix (Collaborator) commented Jan 17, 2019

The problem is that mysql doesn't have cursors, so why would you want to fake them?

In an async world you don't need to expose cursors to the user anyway. You want to simply call

    async with trio_mysql.connect("localhost") as db:
        async with db.query("select a,b,c …") as result:
            async for a,b,c in result:
                await process_record(a,b,c)

instead. So: none of that fetch_one, fetch_many or fetch_dict stuff – let each record be a named tuple instead, and/or add some options that describe what the result should look like.
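The named-tuple idea could be sketched like this (a hypothetical helper, not part of any existing API):

```python
from collections import namedtuple

def rows_as_namedtuples(column_names, raw_rows):
    # Hypothetical helper: wrap raw result tuples so callers can write
    # row.a instead of row[0].
    Row = namedtuple("Row", column_names)
    return [Row(*r) for r in raw_rows]
```

With something like this behind the scenes, `async for row in result` could yield objects where `row.a` works directly.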

If process_record needs to run a query (like an update), well, then you need to save the not-yet-read results in memory – which you should not do when it's not necessary.

Also, cancelling a query requires sending a "kill" command to the server if not all results (if any) have been read. The mysql command-line client can do that: if you interrupt a query before it returns any result, it asks you whether you want to kill it.

Also, MySQL now supports nested transactions (via savepoints). This adapter should support them. It should also refuse to accept queries on the connection (or on outer transactions) while an inner transaction is active, i.e.

    async with trio_mysql.connect("localhost") as db:
        await db.query("insert into test values(1)")  # works, auto-commits
        async with db.transaction() as tr:
            await tr.query("insert into test values(2)")  # works
            await db.query("insert into test values(3)")  # raises an error
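A minimal sketch of how that refusal could be enforced, with invented names and synchronous code purely to illustrate the bookkeeping (not trio-mysql's actual API):

```python
# Invented-name sketch: the connection tracks the active transaction and
# rejects queries on the outer handle while an inner one is open.
class Transaction:
    def __init__(self, conn):
        self._conn = conn

    def query(self, sql):
        return f"executed in tx: {sql}"

    def close(self):
        # Leaving the transaction re-enables queries on the connection.
        self._conn._active_tx = None


class Connection:
    def __init__(self):
        self._active_tx = None

    def query(self, sql):
        if self._active_tx is not None:
            raise RuntimeError("a transaction is active; query it instead")
        return f"executed: {sql}"  # auto-commits, as in the example above

    def transaction(self):
        self._active_tx = Transaction(self)
        return self._active_tx
```

The real version would do this with async context managers, so the "close" bookkeeping happens automatically on `__aexit__`.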
njsmith (Member) commented Jan 26, 2019

Maybe if there's major refactoring happening, it would also be good to refactor into a sans-io protocol handler + a trio adapter, so that there's a path for eventually reconciling with the sync version of the library?

smurfix (Collaborator, Author) commented Jan 26, 2019

Well, that would depend on the sync version to convert to sans-io. Unfortunately I have too much on my plate to even start a discussion about that. :-/

Ninpo (Member) commented Jan 26, 2019

Are there any working examples you're aware of I could use as a reference? I assume by sans-io you mean something like anyio with a TRIO strategy?

I'd be happy to look at that as part of the refactor.

Which brings me to the discussion that started here: #3

Having considered the points @smurfix made at the outset, I completely agree that there's no point bringing bad habits/incorrect implementations into a new library (Trio) that's trying to do exactly the opposite of that in regards to the async world.

To that end, I don't see a reason for trio-mysql not to be a complete fork of the pymysql project, versioning again as we see fit in line with the rest of the Trio project (semver, I believe?). Backporting any upstream fixes that make sense shouldn't be too difficult. I also don't think maintaining compatibility with PyMySQL is necessarily a concern anymore, considering that one of the desires of our refactor is a sane query API: MySQL-land does not implement cursors or buffered results, so a cursor class makes no sense and methods such as fetchone/fetchmany become subject for debate.

For things such as fetchone/fetchmany, we should probably discuss whether they should simply not exist, return some sort of NotImplemented, or be implemented sanely for any ORMs that may expect them to be there... are they a DBAPI 2.0 requirement? I shall read up on the DB-API 2.0 spec (PEP 249).

SQLAlchemy Core already provides conn.execute("STMT")-style syntax, so it's not without precedent. It should be a goal that we can be used in a 'mysql+trio_mysql://' fashion if at all possible with things such as SQLAlchemy or even django-channels implementations.

njsmith (Member) commented Jan 26, 2019

> I assume by sans-io you mean something like anyio with a TRIO strategy?

Ah, no, that's a bit of jargon, sorry :-). Sans-io means splitting out the protocol code from the I/O entirely, so you have a pure in-memory protocol handler that just works on bytes: https://sans-io.readthedocs.io/

Maybe the best way to get a feel for this is to play around a bit with h11, maybe work through the tutorial or something: https://h11.readthedocs.io/en/latest/

[Sorry for the close, fat-fingered the button]

@njsmith njsmith closed this as completed Jan 26, 2019
@njsmith njsmith reopened this Jan 26, 2019
Ninpo (Member) commented Jan 27, 2019

> > I assume by sans-io you mean something like anyio with a TRIO strategy?
>
> Ah, no, that's a bit of jargon, sorry :-). Sans-io means splitting out the protocol code from the I/O entirely, so you have a pure in-memory protocol handler that just works on bytes: https://sans-io.readthedocs.io/
>
> Maybe the best way to get a feel for this is to play around a bit with h11, maybe work through the tutorial or something: https://h11.readthedocs.io/en/latest/
>
> [Sorry for the close, fat-fingered the button]

Having read that, if I understand https://sans-io.readthedocs.io/how-to-sans-io.html#integrating-with-i-o correctly, we have a choice:

1. Making trio-mysql a sane MySQL implementation that can send/receive bytes and leaves the I/O operations to the underlying [a]sync library (a trio socket, a plain socket, or an asyncio socket), so that it works with any of those, or

2. Making trio-mysql a sane MySQL implementation that can send/receive bytes but only concerns itself with doing so via trio's I/O subsystem, a la aiohttp as mentioned on that page.

Do I understand that correctly? If that's the case I'd vote for the latter. I've not had a close look through PyMySQL yet, but I'm hoping the handling of the bytes and the MySQL protocol API are reusable.

njsmith (Member) commented Jan 27, 2019

The general idea of sans-io is that you split off the mysql command parser/serializer into a standalone piece of code that doesn't deal with I/O at all. Its API is that it can answer questions like "hey, if I wanted to make this request, what bytes should I send?", or "hey, if I got these bytes from the server, what do they mean?"

One advantage of this is that it makes it pretty easy to plug in multiple I/O backends. And we know there are at least two interesting I/O backends – trio and synchronous :-). That's what I was getting at when I mentioned "reconciling with the sync version of the library" – maintaining two independent copies of the mysql parsing/serialization code is a waste of effort (and volunteer projects are always short on volunteers!), so it'd be good if we could eventually have just one copy of that code that's used for both trio and pymysql.

But even if that doesn't happen, the sans-I/O approach still has a lot of benefits. In particular, it makes it really easy to write really thorough tests for your parsing/serialization code, because you don't have to set up sockets and stuff like that, you can just hand it canned data and see what you get back.

That said, refactoring from a non-sans-I/O design to a sans-I/O design is itself a chunk of work, and in the short term would make this library diverge further from pymysql, so it may or may not be the best idea here. But in general it's pretty nice :-)
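To make the "what bytes should I send" idea concrete, here is a toy sketch of what a sans-io MySQL protocol core might look like. All names are invented; only the COM_QUERY framing (4-byte packet header, command byte 0x03) reflects the real wire format:

```python
# Toy sans-io sketch (invented names, not trio-mysql's actual design).
# The object never touches a socket: the caller feeds it bytes and asks
# it for bytes to send; all I/O stays outside.
class MySQLProtocol:
    COM_QUERY = 0x03  # real MySQL command byte for a text-protocol query

    def __init__(self):
        self._buffer = bytearray()  # unparsed bytes received so far

    def data_to_send_for_query(self, sql: str) -> bytes:
        """'If I wanted to run this query, what bytes should I send?'"""
        payload = bytes([self.COM_QUERY]) + sql.encode("utf-8")
        # Real packet framing: 3-byte little-endian length + 1-byte sequence id.
        return len(payload).to_bytes(3, "little") + b"\x00" + payload

    def receive_data(self, data: bytes) -> None:
        """Feed bytes from the server; parsing happens later, in memory."""
        self._buffer += data
```

Because it is pure in-memory code, tests can just hand it canned byte strings and check what comes back; no sockets required.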

njsmith (Member) commented Jan 27, 2019

We should also look at aiomysql, which depends on pymysql, so somehow they've found a way to re-use some of the code without forking... (from the README I suspect this partly involves cut-and-paste, but I haven't dug into the details, and it's worth looking at anyway)

(they're also good candidates to help with any sans-io-friendly refactoring :-))

Ninpo (Member) commented Jan 31, 2019

https://dev.mysql.com/doc/refman/5.6/en/cursor-restrictions.html

This suggests MySQL does provide cursors and we're able to use fetchone/fetchmany as expected, @smurfix

smurfix (Collaborator, Author) commented Jan 31, 2019

Ah. Interesting. Does any python mysql DBI actually expose that feature?

Ninpo (Member) commented Jan 31, 2019 via email

mhumpula commented Jan 4, 2021

Although this is old, I've asked the PyMySQL project what they think about it: PyMySQL#912
