
Refactor to (a) get rid of cursors, (b) have a nicer interface #2

Open
smurfix opened this issue Jan 17, 2019 · 11 comments

smurfix (Collaborator) commented Jan 17, 2019

The problem is that mysql doesn't have cursors, so why would you want to fake them?

In an async world you don't need to expose cursors to the user anyway. You want to simply call

    async with trio_mysql.connect("localhost") as db:
        async with db.query("select a,b,c …") as result:
            async for a,b,c in result:
                await process_record(a,b,c)

instead. So: none of that fetch_one, fetch_many or fetch_dict stuff – let each record be a named tuple instead, and/or add some options that describe what the result should look like.
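The named-tuple idea could be sketched like this (a hypothetical helper, not part of any existing API):

```python
from collections import namedtuple

def rows_as_namedtuples(column_names, raw_rows):
    # Hypothetical helper: wrap raw result tuples so callers can write
    # row.a instead of row[0].
    Row = namedtuple("Row", column_names)
    return [Row(*r) for r in raw_rows]
```

With something like this behind the scenes, `async for row in result` could yield objects where `row.a` works directly.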

If process_record needs to run a query (like an update), well, then you need to save the not-yet-read results in memory – which you should not do when it's not necessary.

Also, cancelling a query requires sending a "kill" command to the server if not all results (if any) have been read. The mysql command-line client can do that: if you interrupt a query before it returns any result, it asks you whether you want to kill it.

Also, MySQL now supports nested transactions (via savepoints). This adapter should support them. It should also refuse to accept queries on the connection (or on outer transactions) while an inner transaction is active, i.e.

    async with trio_mysql.connect("localhost") as db:
        await db.query("insert into test values(1)")  # works, auto-commits
        async with db.transaction() as tr:
            await tr.query("insert into test values(2)")  # works
            await db.query("insert into test values(3)")  # raises an error
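A minimal sketch of how that refusal could be enforced, with invented names and synchronous code purely to illustrate the bookkeeping (not trio-mysql's actual API):

```python
# Invented-name sketch: the connection tracks the active transaction and
# rejects queries on the outer handle while an inner one is open.
class Transaction:
    def __init__(self, conn):
        self._conn = conn

    def query(self, sql):
        return f"executed in tx: {sql}"

    def close(self):
        # Leaving the transaction re-enables queries on the connection.
        self._conn._active_tx = None


class Connection:
    def __init__(self):
        self._active_tx = None

    def query(self, sql):
        if self._active_tx is not None:
            raise RuntimeError("a transaction is active; query it instead")
        return f"executed: {sql}"  # auto-commits, as in the example above

    def transaction(self):
        self._active_tx = Transaction(self)
        return self._active_tx
```

The real version would do this with async context managers, so the "close" bookkeeping happens automatically on `__aexit__`.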
njsmith (Member) commented Jan 26, 2019

Maybe if there's major refactoring happening, it would also be good to refactor into a sans-io protocol handler + a trio adapter, so that there's a path for eventually reconciling with the sync version of the library?

smurfix (Collaborator, Author) commented Jan 26, 2019

Well, that would depend on the sync version to convert to sans-io. Unfortunately I have too much on my plate to even start a discussion about that. :-/

Ninpo (Member) commented Jan 26, 2019

Are there any working examples you're aware of I could use as a reference? I assume by sans-io you mean something like anyio with a TRIO strategy?

I'd be happy to look at that as part of the refactor.

Which brings me to the discussion that started here: #3

Having considered the points @smurfix made at the outset, I completely agree that there's no point bringing bad habits/incorrect implementations into a new library (Trio) that's trying to do exactly the opposite of that in regards to the async world.

To that end, I don't see a reason for trio-mysql not to be a complete fork of the pymysql project, versioning again as we see fit in line with the rest of the Trio project (semver, I believe?). Backporting any upstream fixes that make sense shouldn't be too difficult. I also don't think maintaining compatibility with PyMySQL is necessarily a concern anymore, considering that one of the desires of our refactor is a sane query API: MySQL-land does not implement cursors or buffered results, so a cursor class makes no sense and methods such as fetchone/fetchmany become subject for debate.

For things such as fetchone/fetchmany, we should probably discuss whether they should simply not exist, return some sort of NotImplemented, or be implemented sanely for any ORMs that may expect them to be there... are they a DBAPI 2.0 requirement? I shall read up on the DB-API 2.0 spec (PEP 249).

SQLAlchemy Core already provides conn.execute("STMT")-style syntax, so it's not without precedent. It should be a goal that we can be used in a 'mysql+trio_mysql://' fashion if at all possible with things such as SQLAlchemy or even django-channels implementations.

njsmith (Member) commented Jan 26, 2019

> I assume by sans-io you mean something like anyio with a TRIO strategy?

Ah, no, that's a bit of jargon, sorry :-). Sans-io means splitting out the protocol code from the I/O entirely, so you have a pure in-memory protocol handler that just works on bytes: https://sans-io.readthedocs.io/

Maybe the best way to get a feel for this is to play around a bit with h11, maybe work through the tutorial or something: https://h11.readthedocs.io/en/latest/

[Sorry for the close, fat-fingered the button]

@njsmith njsmith closed this as completed Jan 26, 2019
@njsmith njsmith reopened this Jan 26, 2019
Ninpo (Member) commented Jan 27, 2019

> > I assume by sans-io you mean something like anyio with a TRIO strategy?
>
> Ah, no, that's a bit of jargon, sorry :-). Sans-io means splitting out the protocol code from the I/O entirely, so you have a pure in-memory protocol handler that just works on bytes: https://sans-io.readthedocs.io/
>
> Maybe the best way to get a feel for this is to play around a bit with h11, maybe work through the tutorial or something: https://h11.readthedocs.io/en/latest/
>
> [Sorry for the close, fat-fingered the button]

Having read that, if I understand https://sans-io.readthedocs.io/how-to-sans-io.html#integrating-with-i-o correctly, we have a choice:

1. Making trio-mysql a sane MySQL implementation that can send/receive bytes and leaves the I/O operations to the underlying [a]sync library (a trio socket, a plain socket, or an asyncio socket), so that it works with any of those, or

2. Making trio-mysql a sane MySQL implementation that can send/receive bytes but only concerns itself with doing so via trio's I/O subsystem, a la aiohttp as mentioned on that page.

Do I understand that correctly? If that's the case I'd vote for the latter. I've not had a close look through PyMySQL yet, but I'm hoping the handling of the bytes and the MySQL protocol API are reusable.

njsmith (Member) commented Jan 27, 2019

The general idea of sans-io is that you split off the mysql command parser/serializer into a standalone piece of code that doesn't deal with I/O at all. Its API is that it can answer questions like "hey, if I wanted to make this request, what bytes should I send?", or "hey, if I got these bytes from the server, what do they mean?"

One advantage of this is that it makes it pretty easy to plug in multiple I/O backends. And we know there are at least two interesting I/O backends – trio and synchronous :-). That's what I was getting at when I mentioned "reconciling with the sync version of the library" – maintaining two independent copies of the mysql parsing/serialization code is a waste of effort (and volunteer projects are always short on volunteers!), so it'd be good if we could eventually have just one copy of that code that's used for both trio and pymysql.

But even if that doesn't happen, the sans-I/O approach still has a lot of benefits. In particular, it makes it really easy to write really thorough tests for your parsing/serialization code, because you don't have to set up sockets and stuff like that, you can just hand it canned data and see what you get back.

That said, refactoring from a non-sans-I/O design to a sans-I/O design is itself a chunk of work, and in the short term would make this library diverge further from pymysql, so it may or may not be the best idea here. But in general it's pretty nice :-)
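To make the "what bytes should I send" idea concrete, here is a toy sketch of what a sans-io MySQL protocol core might look like. All names are invented; only the COM_QUERY framing (4-byte packet header, command byte 0x03) reflects the real wire format:

```python
# Toy sans-io sketch (invented names, not trio-mysql's actual design).
# The object never touches a socket: the caller feeds it bytes and asks
# it for bytes to send; all I/O stays outside.
class MySQLProtocol:
    COM_QUERY = 0x03  # real MySQL command byte for a text-protocol query

    def __init__(self):
        self._buffer = bytearray()  # unparsed bytes received so far

    def data_to_send_for_query(self, sql: str) -> bytes:
        """'If I wanted to run this query, what bytes should I send?'"""
        payload = bytes([self.COM_QUERY]) + sql.encode("utf-8")
        # Real packet framing: 3-byte little-endian length + 1-byte sequence id.
        return len(payload).to_bytes(3, "little") + b"\x00" + payload

    def receive_data(self, data: bytes) -> None:
        """Feed bytes from the server; parsing happens later, in memory."""
        self._buffer += data
```

Because it is pure in-memory code, tests can just hand it canned byte strings and check what comes back; no sockets required.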

njsmith (Member) commented Jan 27, 2019

We should also look at aiomysql, which depends on pymysql, so somehow they've found a way to re-use some of the code without forking... (from the README I suspect this partly involves cut-and-paste, but I haven't dug into the details, and it's worth looking at anyway)

(they're also good candidates to help with any sans-io-friendly refactoring :-))

Ninpo (Member) commented Jan 31, 2019

https://dev.mysql.com/doc/refman/5.6/en/cursor-restrictions.html

This suggests MySQL does provide cursors and we're able to use fetchone/fetchmany as expected, @smurfix

smurfix (Collaborator, Author) commented Jan 31, 2019

Ah. Interesting. Does any python mysql DBI actually expose that feature?

Ninpo (Member) commented Jan 31, 2019 via email

mhumpula commented Jan 4, 2021

Although this is old, I've asked the PyMySQL project what they think about it: PyMySQL#912
