Refactor to (a) get rid of cursors, (b) have a nicer interface #2
Comments
Maybe if there's major refactoring happening, it would also be good to refactor into a sans-io protocol handler + a trio adapter, so that there's a path for eventually reconciling with the sync version of the library? |
Well, that would depend on the sync version to convert to sans-io. Unfortunately I have too much on my plate to even start a discussion about that. :-/ |
Are there any working examples you're aware of that I could use as a reference? I assume by sans-io you mean something like anyio with a Trio strategy? I'd be happy to look at that as part of the refactor.

Which brings me to the discussion that started here: #3. Having considered the points @smurfix made at the outset, I completely agree that there's no point bringing bad habits and incorrect implementations into a new library for Trio, which is trying to do exactly the opposite in the async world. To that end I don't see a reason for trio-mysql not to be a complete fork of the pymysql project, and thus to start versioning again as we see fit, in line with the rest of the Trio project (semver, I believe?). I don't think it would be too difficult to backport any upstream fixes that make sense in the future, but I don't think maintaining compatibility with PyMySQL is necessarily a concern anymore: one of the goals of our refactor is a sane query API, and MySQL land does not implement cursors or buffered results, so a cursor class makes no sense and methods such as fetchone/fetchmany become subject for debate.

We should probably discuss whether things like fetchone/fetchmany should not exist at all, should return some sort of NotImplemented, or whether there's a way we can implement them sanely for any ORMs that may expect them to be there. Is that a DBAPI 2.0 requirement? I shall read up on the DBAPI 2.0 spec. SQLAlchemy core already provides a `conn.execute("STMT")` style syntax, so it's not without precedent. It should be a goal that we can be used in a `'mysql+trio_mysql://'` fashion if at all possible with things such as SQLAlchemy or even django-channels implementations. |
Ah, no, that's a bit of jargon, sorry :-). Sans-io means splitting out the protocol code from the I/O entirely, so you have a pure in-memory protocol handler that just works on bytes: https://sans-io.readthedocs.io/ Maybe the best way to get a feel for this is to play around a bit with h11, maybe work through the tutorial or something: https://h11.readthedocs.io/en/latest/ [Sorry for the close, fat-fingered the button] |
Having read that, if I understand https://sans-io.readthedocs.io/how-to-sans-io.html#integrating-with-i-o correctly, we have a choice:

- Making trio-mysql a sane mysql implementation that can send/receive bytes but leaves the I/O operations to the underlying [a]sync library, so it works equally well with a trio socket, a plain socket, or an asyncio socket; or
- Making trio-mysql a sane mysql implementation that can send/receive bytes but only concerns itself with doing so via trio's I/O subsystem, à la aiohttp as mentioned on that page.

Do I understand that correctly? If so, I'd vote for the latter. I've not had a close look through PyMySQL yet, but I'm hoping the handling of the bytes and the mysql protocol API are reusable. |
The general idea of sans-io is that you split off the mysql command parser/serializer into a standalone piece of code that doesn't deal with I/O at all. Its API answers questions like "hey, if I wanted to make this request, what bytes should I send?", or "hey, if I got these bytes from the server, what do they mean?"

One advantage of this is that it makes it pretty easy to plug in multiple I/O backends. And we know there are at least two interesting I/O backends – trio and synchronous :-). That's what I was getting at when I mentioned "reconciling with the sync version of the library" – maintaining two independent copies of the mysql parsing/serialization code is a waste of effort (and volunteer projects are always short on volunteers!), so it'd be good if we could eventually have just one copy of that code that's used for both trio and pymysql.

But even if that doesn't happen, the sans-I/O approach still has a lot of benefits. In particular, it makes it really easy to write really thorough tests for your parsing/serialization code, because you don't have to set up sockets and stuff like that; you can just hand it canned data and see what you get back.

That said, refactoring from a non-sans-I/O design to a sans-I/O design is itself a chunk of work, and in the short term it would make this library diverge further from pymysql, so it may or may not be the best idea here. But in general it's pretty nice :-) |
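To make the idea concrete, here is a minimal sketch of a sans-I/O piece for one small corner of the MySQL wire protocol: packet framing (every MySQL message is a 3-byte little-endian payload length, a 1-byte sequence id, then the payload). The class name `PacketBuffer` and its methods are made up for illustration – this is not pymysql's API – but it shows the pattern: bytes in, parsed events out, no sockets anywhere.

```python
# Sans-I/O sketch: turn raw bytes into (sequence_id, payload) events.
# Nothing here touches a socket, so trio, asyncio, or a plain blocking
# socket loop could all drive it equally well.

class PacketBuffer:
    def __init__(self):
        self._buf = bytearray()

    def receive_data(self, data: bytes) -> None:
        """Feed raw bytes read from *any* I/O layer."""
        self._buf += data

    def next_packet(self):
        """Return (seq, payload) if a whole packet is buffered, else None."""
        if len(self._buf) < 4:
            return None
        length = int.from_bytes(self._buf[:3], "little")
        if len(self._buf) < 4 + length:
            return None
        seq = self._buf[3]
        payload = bytes(self._buf[4:4 + length])
        del self._buf[:4 + length]
        return seq, payload


# Canned-data testing, no sockets needed -- the benefit mentioned above:
buf = PacketBuffer()
buf.receive_data(b"\x05\x00\x00")   # partial header: not enough yet
assert buf.next_packet() is None
buf.receive_data(b"\x01hello")      # rest of header + 5-byte payload
assert buf.next_packet() == (1, b"hello")
```

The real parser/serializer would of course cover the full command and result-set grammar, but the shape stays the same: `receive_data()` plus an event-pulling method, exactly as in h11.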
We should also look at aiomysql, which depends on pymysql, so somehow they've found a way to re-use some of the code without forking... (from the README I suspect this partly involves cut-and-paste, but I haven't dug into the details, and it's worth looking at anyway) (they're also good candidates to help with any sans-io-friendly refactoring :-)) |
https://dev.mysql.com/doc/refman/5.6/en/cursor-restrictions.html This suggests MySQL does provide cursors and we're able to use fetchone/fetchmany as expected @smurfix |
Ah. Interesting. Does any python mysql DBI actually expose that feature? |
From what I can tell about how it works, that's what the SSCursor/SSDictCursor classes in pymysql and mysqldb make use of. I've done some testing with it today and can confirm that fetchmany is much more memory efficient than fetchall. You're right that it's not a cursor like you'd expect from a database like PostgreSQL; it's more like a buffered result (a MEMORY or temp table created on the MySQL side).
> On Thu, 31 Jan 2019 at 20:11, Matthias Urlichs wrote: Ah. Interesting. Does any python mysql DBI actually expose that feature?
|
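The memory difference described above comes down to how many rows are resident at once: fetchall materializes the entire result set, while fetchmany-style chunking only ever holds one batch. A self-contained sketch (plain Python iterators standing in for an unbuffered SSCursor-style result stream – none of these names are pymysql's API):

```python
# Why fetchmany-style chunking bounds memory: only `size` rows are
# held at a time, while a fetchall would materialize all of them.
from itertools import islice

def rows_from_server(n):
    """Stand-in for an unbuffered (SSCursor-style) result stream."""
    for i in range(n):
        yield (i, f"row-{i}")

def fetchmany(stream, size):
    """Pull at most `size` rows from the stream, like cursor.fetchmany."""
    return list(islice(stream, size))

stream = rows_from_server(10)
first = fetchmany(stream, 3)    # 3 rows in memory, not 10
second = fetchmany(stream, 3)   # next batch; earlier batch can be freed
assert first == [(0, "row-0"), (1, "row-1"), (2, "row-2")]
assert second[0] == (3, "row-3")
```

With a real server the streaming happens on the wire instead of from a generator, but the memory profile is the same: peak usage is one batch, not the whole result set.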
Although this is old, I've asked the PyMySQL maintainers what they think about it: PyMySQL#912 |
The problem is that mysql doesn't have cursors, so why would you want to fake them?

In an async world you don't need to expose cursors to the user anyway. You want to simply call … instead. So, none of that `fetch_one`, `fetch_many` or `fetch_dict` stuff – let the record be a named tuple instead, and/or add some options that describe what the result should look like. If `process` needs to run a query (like an update), well, you need to save the not-yet-read results in memory – which you should not do when that is not necessary.

Also, cancelling a query requires sending a "kill" command to the server if not all results have been read (if any). The mysql command line client can do that: if you interrupt a query before it returns any result, it asks you whether you want to.
Also, mysql now supports nested transactions. This adapter should support them. It should also refuse to accept queries on the connection (or on outer transactions) while an inner transaction is active, i.e.
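The "iterate instead of fetch_*" interface suggested above could look something like the following. Every name here is hypothetical – this is not trio-mysql's actual API – and asyncio is used only because it is in the stdlib; a trio version would look the same with `trio.run` and a trio-backed connection.

```python
# Hypothetical interface sketch: execute() yields named-tuple records
# asynchronously, so there is no cursor object and no fetch_one /
# fetch_many / fetch_dict methods for callers to reach for.
import asyncio
from collections import namedtuple

Record = namedtuple("Record", ["id", "name"])

async def execute(stmt):
    """Pretend to stream result rows for `stmt` from the server."""
    for i in range(3):
        yield Record(i, f"user{i}")

async def main():
    names = []
    async for record in execute("SELECT id, name FROM users"):
        names.append(record.name)   # records are named tuples
    return names

assert asyncio.run(main()) == ["user0", "user1", "user2"]
```

Consuming rows as they arrive is also what makes the cancellation point above tractable: if the caller abandons the `async for` early, the connection knows results remain unread and can send the "kill" command.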