Update jupyter_ydoc and pycrdt_websocket dependencies #367

Merged 7 commits on Oct 16, 2024
58 changes: 31 additions & 27 deletions projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py
@@ -8,14 +8,14 @@
 import time
 import uuid
 from logging import Logger
-from typing import Any
+from typing import Any, Literal
 from uuid import uuid4

 from jupyter_server.auth import authorized
 from jupyter_server.base.handlers import APIHandler, JupyterHandler
 from jupyter_server.utils import ensure_async
 from jupyter_ydoc import ydocs as YDOCS
-from pycrdt import Doc, UndoManager, YMessageType, write_var_uint
+from pycrdt import Doc, UndoManager, write_var_uint
 from pycrdt_websocket.websocket_server import YRoom
 from pycrdt_websocket.ystore import BaseYStore
 from tornado import web
@@ -137,6 +137,10 @@ def exception_logger(exception: Exception, log: Logger) -> bool:
                     exception_handler=exception_logger,
                 )

+            if self._room_id == "JupyterLab:globalAwareness":
+                # Listen for the changes in GlobalAwareness to update users
+                self.room.awareness.observe(self._on_global_awareness_event)
+
             try:
                 await self._websocket_server.start_room(self.room)
             except Exception as e:
@@ -286,31 +290,6 @@ async def on_message(self, message):
         """
         message_type = message[0]

-        if message_type == YMessageType.AWARENESS:
-            # awareness
-            skip = False
-            changes = self.room.awareness.get_changes(message[1:])
-            added_users = changes["added"]
-            removed_users = changes["removed"]
-            for i, user in enumerate(added_users):
-                u = changes["states"][i]
-                if "user" in u:
-                    name = u["user"]["name"]
-                    self._websocket_server.connected_users[user] = name
-                    self.log.debug("Y user joined: %s", name)
-            for user in removed_users:
-                if user in self._websocket_server.connected_users:
-                    name = self._websocket_server.connected_users[user]
-                    del self._websocket_server.connected_users[user]
-                    self.log.debug("Y user left: %s", name)
-            # filter out message depending on changes
-            if skip:
-                self.log.debug(
-                    "Filtered out Y message of type: %s",
-                    YMessageType(message_type).name,
-                )
-                return skip
-
         if message_type == MessageType.CHAT:
             msg = message[2:].decode("utf-8")

@@ -405,6 +384,31 @@ async def _clean_room(self) -> None:
             self._emit(LogLevel.INFO, "clean", "Loader deleted.")
             del self._room_locks[self._room_id]

+    def _on_global_awareness_event(
+        self, topic: Literal["change", "update"], changes: tuple[dict[str, Any], Any]
+    ) -> None:
+        """
+        Update the users when the global awareness changes.
+
+        Parameters:
+            topic (str): `"update"` or `"change"` (`"change"` is triggered only if the states are modified).
+            changes (tuple[dict[str, Any], Any]): The changes and the origin of the changes.
+        """
+        if topic != "change":
+            return
+        added_users = changes[0]["added"]
+        removed_users = changes[0]["removed"]
Collaborator

What about "updated" users?

Contributor Author

Actually, I wonder whether the `connected_users` of the `_websocket_server` is even used.
I copied that code from the previous message handler, but I don't know if we need that function.
If we keep it, we should indeed handle the "updated" users (removing the former name and adding the new one).

Contributor Author

> If we keep it, we should indeed handle the "updated" users (removing the former name and add the new one).

Thinking about this again: we don't have the previous name, because the changes contain only the client IDs.
Since we can't tell whether the user name has changed, we should rebuild the full list from the global awareness whenever a user is updated.
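
A minimal sketch of the "rebuild the full list" idea, assuming the awareness states form a mapping from client ID to a state dict that may contain a `"user"` entry with a `"name"` (the shape the handler code in this PR reads). The helper name `rebuild_connected_users` is hypothetical and is not part of this PR or of pycrdt.

```python
from typing import Any


def rebuild_connected_users(states: dict[int, dict[str, Any]]) -> dict[int, str]:
    """Return a fresh client-ID -> user-name mapping from awareness states.

    `states` stands in for `self.room.awareness.states` (assumed shape).
    """
    users: dict[int, str] = {}
    for client_id, state in states.items():
        user = state.get("user")
        # Skip clients that have not published any user information.
        if isinstance(user, dict) and "name" in user:
            users[client_id] = user["name"]
    return users


states = {
    1: {"user": {"name": "alice"}},
    2: {"user": {"name": "bob"}},
    3: {},  # a client without user information
}
connected_users = rebuild_connected_users(states)
print(connected_users)
```

Rebuilding from scratch avoids having to know the former name, at the cost of walking every state on each "updated" event.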

Collaborator

Right, the goal of this code was to check in the backend that the awareness information from the frontend is correct. For instance, we don't want a student to take the user name of the teacher. That's why there is this skip variable that was supposed to filter out an awareness message, but this was never put to actual use.
I'm wondering how we can filter out a message with your changes though?
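
To make the student/teacher example concrete, here is a hedged sketch of the kind of check being discussed. It assumes the backend knows an authenticated name for the connection; where that name comes from, and where such a check would be called, is exactly what this thread leaves open. `awareness_user_is_valid` is a hypothetical helper, not existing API.

```python
from typing import Any


def awareness_user_is_valid(state: dict[str, Any], authenticated_name: str) -> bool:
    """Return True if the user name claimed in `state` matches the authenticated one.

    A state without a "user" entry makes no identity claim, so it passes.
    """
    user = state.get("user")
    if user is None:
        return True
    if not isinstance(user, dict):
        return False
    return user.get("name") == authenticated_name


# A connection authenticated as "student" claims the teacher's name.
print(awareness_user_is_valid({"user": {"name": "teacher"}}, "student"))  # False
print(awareness_user_is_valid({"user": {"name": "student"}}, "student"))  # True
```

To be useful, a check like this would have to run before the update is applied to the backend awareness and forwarded to other clients.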

Contributor Author

I don't know if we can filter out a message with the information we currently have.
For example, when you reload the page, a new client ID is added with the same user information. The old client ID is only removed in a second step (probably once updates stop arriving from it?).
So, for a period of time, the same user is duplicated under 2 different client IDs. If we allow this behavior, I don't know how we can filter out someone trying to cheat.

The following image shows some logs when a remote client reloads the page. When a change is received, the change and the current users in the awareness are printed.
[Image: Screenshot from 2024-10-15 16-10-45]

Contributor Author

> The logic to filter out messages is here already, but how can we connect it to the changes here?

I saw that part, but on my side it is never reached: `self.on_message` is never set. I don't understand what the expected logic was here, or what the `self.on_message` function should contain.

Collaborator

I think I never finished the work, but self.on_message should be a callback that would basically be this.

Contributor Author (@brichet, Oct 15, 2024)

> We should not apply the awareness in the first place if it's not correct.

This is what I meant with "the simplest is probably to do it in the YRoom directly, before updating the awareness".

> I think that what's wrong is that with this PR, we apply the awareness updates coming from clients to the awareness in the backend, and then we observe the changes and decide if the updates should be forwarded to other clients, but it's already too late, right?

From what I understand, the message is forwarded to the clients at the same time as the awareness in the backend is updated, here, and then the user list is updated in the `_websocket_server`.
But I'm probably missing something...

Collaborator

It's actually used in jupyverse.

Collaborator

> From what I understand, the message is forwarded to the clients at the same time as the awareness in the backend is updated, here, and then the user list is updated in the `_websocket_server`.

Yes, but we should not execute this code if the awareness update is "invalid".

+        for user in added_users:
+            u = self.room.awareness.states[user]
+            if "user" in u:
+                name = u["user"]["name"]
+                self._websocket_server.connected_users[user] = name
+                self.log.debug("Y user joined: %s", name)
+        for user in removed_users:
+            if user in self._websocket_server.connected_users:
+                name = self._websocket_server.connected_users.pop(user)
+                self.log.debug("Y user left: %s", name)
+
     def check_origin(self, origin):
         """
         Check origin
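
The bookkeeping done by `_on_global_awareness_event` can be tried out without a server. The sketch below mirrors it on plain dictionaries: `states` stands in for `self.room.awareness.states` and `connected_users` for `self._websocket_server.connected_users`. The function name `apply_awareness_change` is hypothetical, and the lookups are written defensively with `.get()` and `.pop(..., None)`, which differs slightly from the direct indexing in the diff.

```python
from typing import Any, Literal


def apply_awareness_change(
    topic: Literal["change", "update"],
    changes: tuple[dict[str, Any], Any],
    states: dict[int, dict[str, Any]],
    connected_users: dict[int, str],
) -> None:
    """Mirror the handler's user bookkeeping on plain dictionaries."""
    if topic != "change":
        # Only "change" events signal that the states were modified.
        return
    details, _origin = changes
    for client_id in details["added"]:
        state = states.get(client_id, {})
        if "user" in state:
            connected_users[client_id] = state["user"]["name"]
    for client_id in details["removed"]:
        connected_users.pop(client_id, None)


states = {7: {"user": {"name": "alice"}}}
connected_users: dict[int, str] = {}
joined = ({"added": [7], "removed": []}, "remote")
left = ({"added": [], "removed": [7]}, "remote")

apply_awareness_change("update", joined, states, connected_users)
after_update = dict(connected_users)  # still empty: wrong topic

apply_awareness_change("change", joined, states, connected_users)
after_join = dict(connected_users)

apply_awareness_change("change", left, states, connected_users)
after_leave = dict(connected_users)
```

As raised in the review thread, nothing here handles "updated" clients, so a renamed user would keep the old name until the client is removed.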
2 changes: 1 addition & 1 deletion projects/jupyter-server-ydoc/jupyter_server_ydoc/rooms.py
@@ -41,7 +41,7 @@ def __init__(
         self._file_format: str = file_format
         self._file_type: str = file_type
         self._file: FileLoader = file
-        self._document = YDOCS.get(self._file_type, YFILE)(self.ydoc)
+        self._document = YDOCS.get(self._file_type, YFILE)(self.ydoc, self.awareness)
         self._document.path = self._file.path

         self._logger = logger
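
The changed line looks up a document class by file type, falls back to a default class, and now passes the awareness as a second argument. A minimal sketch of that lookup-with-fallback pattern follows; `FakeYFile`, `FakeYNotebook`, `FAKE_YDOCS` and `make_document` are hypothetical stand-ins, not the real `jupyter_ydoc` classes or registry.

```python
from typing import Any, Optional


class FakeYFile:
    """Hypothetical fallback document type."""

    def __init__(self, ydoc: Any, awareness: Optional[Any] = None) -> None:
        self.ydoc = ydoc
        self.awareness = awareness


class FakeYNotebook(FakeYFile):
    """Hypothetical notebook document type."""


# Stand-in for the YDOCS registry: file type -> document class.
FAKE_YDOCS: dict[str, type[FakeYFile]] = {"notebook": FakeYNotebook}


def make_document(file_type: str, ydoc: Any, awareness: Any) -> FakeYFile:
    """Pick the class registered for `file_type`, or fall back to FakeYFile."""
    return FAKE_YDOCS.get(file_type, FakeYFile)(ydoc, awareness)


notebook_doc = make_document("notebook", ydoc="doc", awareness="aw")
unknown_doc = make_document("csv", ydoc="doc", awareness="aw")
print(type(notebook_doc).__name__, type(unknown_doc).__name__)
```

Because the second argument is passed positionally, every registered class must accept it, which is presumably why this PR also raises the minimum `jupyter_ydoc` version.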
4 changes: 2 additions & 2 deletions projects/jupyter-server-ydoc/pyproject.toml
@@ -29,9 +29,9 @@ authors = [
 ]
 dependencies = [
     "jupyter_server>=2.11.1,<3.0.0",
-    "jupyter_ydoc>=2.0.0,<4.0.0",
+    "jupyter_ydoc>=2.1.2,<4.0.0",
     "pycrdt",
-    "pycrdt-websocket>=0.14.2,<0.15.0",
+    "pycrdt-websocket>=0.15.0,<0.16.0",
     "jupyter_events>=0.10.0",
     "jupyter_server_fileid>=0.7.0,<1",
     "jsonschema>=4.18.0"