Replace the weekly summary report #6

Open
1 of 2 tasks
ezio-melotti opened this issue Aug 15, 2021 · 20 comments
Labels
help wanted Extra attention is needed

Comments

@ezio-melotti
Member

ezio-melotti commented Aug 15, 2021

The roundup-summary script has been used to send weekly reports to the python-dev mailing list. The script is executed once a week by a cron job.

  • Find/write a replacement for the weekly summary report
  • Set up the replacement to send a weekly mail to python-dev

Adding a dashboard similar to the Django dashboard has been proposed on the TrackerDevelopmentPlanning wiki page and discussed. Custom views provided by GitHub issues could be a simpler alternative to the dashboard.

@ezio-melotti ezio-melotti added the help wanted Extra attention is needed label Aug 15, 2021
@terryjreedy

My pydev answer:
Essential for my current non-IDLE contributions. I get this every Friday about noon US Eastern and usually use it for a few hours of tracker triage later that day and maybe into Saturday. I usually close at least one issue each week, as either rejected or fixed.

I prefer the single batch list because I can move it to a separate Thunderbird tab*, work in a new Firefox window, and dedicate a weekly period of time. I believe it is now latin-1 encoded and would prefer it to be utf-8.

  • One useful feature is that Thunderbird checks the browser cache and marks seen URLs as such.

I can imagine that others prefer the ongoing emails.

@pxeger

pxeger commented Sep 3, 2021

As a casual peruser of python-dev but not someone who regularly checks the issue tracker, these weekly emails give me a reason to look at some interesting bugs that I wouldn't otherwise know about.

@ezio-melotti
Member Author

ezio-melotti commented Sep 7, 2021

After some feedback and a brief discussion with the Steering Council, it seems to me that:

  • Sending a mail with the weekly summary report seems important for quite a few people, so we should set up a replacement;
  • Building a custom dashboard is probably overkill;
  • With the new GitHub issues, it should also be possible to have custom views that:
    • can overlap with the sections of the summary (e.g. issues created in the last week);
    • can replace the suggested dashboard, offering an alternative workflow;
    • Correction: this only applies to projects (at least for the time being)

@trallard
Member

I will be working on this issue as discussed in #9

@Harry-Lees

Is this still being worked on? I'd like to help contribute to this effort if there are still things to do.

@ezio-melotti
Member Author

I'm not sure if @trallard is still working on this, but a replacement for the weekly summary should still be implemented.

@Harry-Lees

Where would a PR be created? I see that the existing script is in the bpo-tracker-python repository. Should a PR be created there?

I think all of the issue information should be available through the GitHub API, but I'll have a more thorough look tomorrow.

@ezio-melotti
Member Author

ezio-melotti commented Feb 8, 2022

I think this could/should be implemented as a standalone GitHub Action that we can run weekly.
There are a few scripts/GHAs that need to be developed to replace some of the functionalities of Roundup, and we haven't decided yet where they would live: they could each have their own repo or they could be grouped into the same one. There are a few options:

  • It could be a simple script in python/cpython/.github/workflows, invoked in the same way that posix-deps-apt.sh is, so you could create a PR against python/cpython;
  • It could be a simple GitHub action in python/cpython/.github/actions, invoked by the workflows in python/cpython/.github/workflows;
  • You could also create your own repo, and then transfer the ownership to the psf after the migration;
  • We could set up a new repo and you could create a PR against that (but we would need to discuss whether we need a separate repo for that);
  • We could also add the code to this repo and rename/repurpose it at the end of the migration.

@trallard
Member

trallard commented Feb 9, 2022

Hey @ezio-melotti and @Harry-Lees - I started working on this and then things got super busy at work so this fell a bit through the cracks. I can try and finish and send a PR by the end of the week/early next week.

I was planning to send the PR against this repo and have it moved to where it needs to in the end.

Thoughts?

@Harry-Lees

I've begun writing a script for this at https://github.com/Harry-Lees/RoundupSummary. I've been testing off the issues in this repository https://github.com/python/issues-test-demo-20220218/issues.

I had a couple of questions about this that I was hoping you might be able to shed some light on, @ezio-melotti.

  • Are there any plans to include labels for things like "needs review"? I looked through the labels on the test repository and it doesn't look like there's anything there currently. On the bpo script there are keywords for these things, but as far as I can tell there's no easy way to tell on the new system.

  • Is there a database of issues that we can store locally, or can we create one? If this script is running completely in a stateless environment, this might not be possible. As far as I can tell, there isn't a way to easily get a count of all issues without just asking the GitHub API for all of the issues. The API returns issues in pages of at most 100. As of the latest summary email there were 58,714 total issues on BPO, meaning we'd have to query the API 588 times to get all of those issues. This isn't necessarily a problem, as the GitHub API rate limits are significantly higher than this, but if we had something like a local SQLite3 database we could reduce that significantly by only tracking changes.
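
The request count quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: `pages_needed` is a hypothetical helper, and the page size of 100 is the limit mentioned in the comment.

```python
import math

PER_PAGE = 100  # maximum page size mentioned in the comment above


def pages_needed(total_issues: int, per_page: int = PER_PAGE) -> int:
    """Return how many paginated requests are needed to list every issue."""
    return math.ceil(total_issues / per_page)


print(pages_needed(58_714))  # 588
```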

I don't want to pollute this thread too much, so I'll just keep the repository README updated with progress.

@ezio-melotti
Member Author

Are there any plans to include labels for things like "needs review"?

Labels are discussed in #5, which I'm planning to update soon. needs review won't be included because it's already tracked by GitHub's pull requests.

Is there a database of issues that we can store locally, or can we create one?

I guess one way is exporting all the issues and then filtering the raw JSON. If you find a way to do it in a GitHub Action through the GitHub API, that would be better.
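
As a rough illustration of the "export, then filter" idea, the sketch below keeps only the issues opened in the last seven days. It assumes the export is a JSON list of issue objects carrying the `number` and `created_at` fields that the GitHub REST API returns; the sample data and the `opened_since` helper are made up for the example.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse a GitHub-style UTC timestamp such as '2022-04-08T12:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def opened_since(issues, since):
    """Keep only the issues created at or after `since`."""
    return [issue for issue in issues if parse_ts(issue["created_at"]) >= since]


# Made-up export with one recent and one old issue.
raw = """[
  {"number": 1, "created_at": "2022-04-08T12:00:00Z"},
  {"number": 2, "created_at": "2022-03-01T09:30:00Z"}
]"""

issues = json.loads(raw)
end = datetime(2022, 4, 11, tzinfo=timezone.utc)
recent = opened_since(issues, end - timedelta(days=7))
print([issue["number"] for issue in recent])
```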

@ezio-melotti
Member Author

We are going to migrate to GitHub this weekend.
@Harry-Lees, what's the status on the summary script?

@Harry-Lees

We are going to migrate to GitHub this weekend. @Harry-Lees, what's the status on the summary script?

Congratulations on the successful migration! I apologise for the slow progress; it's been a bit hectic with the end of the university term :).

Currently the script is functional but incomplete. It generates an HTML file which looks like this:

<h1>ACTIVITY SUMMARY</h1>
<p>(2022-04-04 - 2022-04-11)</p>
<p>Python tracker at <a href="https://github.com/python/cpython/issues">https://github.com/python/cpython/issues</a></p>

<p>To view or respond to any of the issues listed below, click on the issue.
    Do NOT respond to this message.</p>

<p>Issues stats:</p>
<table border="1">
    <tr>
        <th>open</th>
        <td>7145 (+30)</td>
    </tr>
    <tr>
        <th>closed</th>
        <td>51857 (+68)</td>
    </tr>
    <tr>
        <th>total</th>
        <td>59002 (-38)</td>
    </tr>
</table>

<p>Open issues with patches: 0</p>

<p>Issues Opened (30)</p>
=======================================
<p>#91431: Docs for `re` inaccurately indicate support for non-capturing inline modifiers<br><a href="https://github.com/python/cpython/issues/91431">https://github.com/python/cpython/issues/91431</a> opened by ofek</p>
<p>#91428: Add opname array to `opcode.h` of debug builds.<br><a href="https://github.com/python/cpython/issues/91428">https://github.com/python/cpython/issues/91428</a> opened by sweeneyde</p>
<p>#91427: Out-of-date links to Python 3 docs from docs.python.org/2<br><a href="https://github.com/python/cpython/issues/91427">https://github.com/python/cpython/issues/91427</a> opened by Zac-HD</p>

<p>Issues Closed (68)</p>
=======================================
<p>#91424: reference to the previous issue tracker in the main readme<br><a href="https://github.com/python/cpython/issues/91424">https://github.com/python/cpython/issues/91424</a> opened by royreznik</p>
<p>#91416: os.closerange() can be no-op in a seccomp sandbox<br><a href="https://github.com/python/cpython/issues/91416">https://github.com/python/cpython/issues/91416</a> opened by Mannequin</p>
<p>#91413: add methods to get first and last elements of a range<br><a href="https://github.com/python/cpython/issues/91413">https://github.com/python/cpython/issues/91413</a> opened by Mannequin</p>

(I've removed most of the actual issues for brevity).
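
For readers who want to see how one entry of this output could be produced, here is a minimal sketch using only the standard library. `render_issue` is a hypothetical helper, not taken from the actual script; it escapes the title so that characters such as `<` in issue titles do not break the HTML.

```python
from html import escape


def render_issue(number: int, title: str, url: str, author: str) -> str:
    """Render one issue as a paragraph in the style of the report above."""
    safe_url = escape(url)  # escape() also escapes quotes by default
    return (
        f"<p>#{number}: {escape(title)}<br>"
        f'<a href="{safe_url}">{safe_url}</a> opened by {escape(author)}</p>'
    )


entry = render_issue(
    91428,
    "Add opname array to `opcode.h` of debug builds.",
    "https://github.com/python/cpython/issues/91428",
    "sweeneyde",
)
print(entry)
```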

Currently the things that need fixing are:

  • Sending the emails. This functionality was basically abstracted away by the roundup library in the current roundup-summary script, so I'm not entirely sure what to do on this front. I assume we can set some secrets in the cpython GitHub repo which will be passed as environment variables to authenticate the email account, but I'm not sure where the list of recipients comes from.
  • The "Most recent 15 issues waiting for review" field is currently not included. I don't think this should be too hard to implement now that I know my way around the GitHub API a bit better.
  • "Open issues with patches", similar again to the recent issues for review. This should be a very simple GraphQL query.
  • Most of the issues will have "Opened by Mannequin". I think over time this will naturally filter out, but there might be a way of fixing it in the short term just in case.

I've added a rudimentary workflow script to my test repository with the aim of making the entire script run as a GitHub Action as you suggested. I haven't thoroughly tested it yet, though.

The issue I raised earlier with the API calls was resolved nicely by using a combination of the GitHub GraphQL and REST APIs, allowing us to make significantly fewer API calls (only 3 calls are made, assuming there are fewer than 100 issues created in a week).
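
To illustrate why GraphQL cuts the number of calls, the sketch below builds a single query that asks for the open and closed issue counts through the `totalCount` field of the repository's `issues` connection. It is not taken from the actual script: `build_payload`, `summarize`, and the sample response are hypothetical, and no network request is made.

```python
import json

# One query returns both counts, using GraphQL aliases for the two connections.
COUNTS_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    open: issues(states: OPEN) { totalCount }
    closed: issues(states: CLOSED) { totalCount }
  }
}
"""


def build_payload(owner: str, name: str) -> str:
    """Build the JSON body that would be POSTed to the GraphQL endpoint."""
    return json.dumps({"query": COUNTS_QUERY, "variables": {"owner": owner, "name": name}})


def summarize(response: dict) -> dict:
    """Turn a response shaped like the query into open/closed/total counts."""
    repo = response["data"]["repository"]
    open_count = repo["open"]["totalCount"]
    closed_count = repo["closed"]["totalCount"]
    return {"open": open_count, "closed": closed_count, "total": open_count + closed_count}


# Hand-written sample response using the numbers from the report above.
sample = {"data": {"repository": {"open": {"totalCount": 7145}, "closed": {"totalCount": 51857}}}}
print(summarize(sample))
```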

Overall, assuming the email sending isn't a hard fix, I think it could be finished by the end of this week.

@ezio-melotti
Member Author

Thanks for looking into this!

Sending the emails, this functionality was basically abstracted away by the roundup library in the current roundup-summary script, so I'm not entirely sure what to do on this front

We are currently testing an action that sends emails. You can find the source here https://github.com/python/cpython/blob/main/.github/workflows/new-bugs-announce-notifier.yml . This is based on #7 (comment)
In theory we could also run this on the same server where we are running the original roundup-summary if turning it into an action is too much work.

@ewdurbin do you have any suggestion about this?

Most of the issues will have "Opened by Mannequin"

If you look at the issue list, the author of the issue is visible even if it's a mannequin, and mannequin is just a "tag" next to the name. Perhaps there is another field that contains the name?

@Harry-Lees

Harry-Lees commented Apr 10, 2022

You can find the source here

Thanks, I'll take a look at that!

the author of the issue is visible even if it's a mannequin

I did see this, but weirdly GitHub doesn't seem to include that in their API (at least not the REST API). For example, the issue python/cpython#91413 returns this data:

"user": {
        "login": "91e69f45-91d9-4b12-87db-a02908296c81",
        "id": 92819654,
        ...
        "type": "Mannequin",
        "site_admin": false
},

Normally the "login" field would just be the username.

There are a couple of potential fixes but it depends on how much of an issue you think it is.

  • I could look at the GraphQL endpoint to see if there's any more data available there, I think it's unlikely but it could be possible. I might've spoken too soon, it does look like it may be possible (https://docs.github.com/en/graphql/reference/objects#mannequin).
  • Since the body of the Issue includes all the fields from the bpo, I can grab the username from there on issues that have been migrated across (this obviously won't apply to issues created on Github).
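
Whichever fix is chosen, the script first has to notice that the author is a mannequin. A minimal sketch based on the REST payload shown above follows; `display_author` and its `fallback` parameter are hypothetical, and the fallback would come from one of the two sources listed (GraphQL or the issue body).

```python
def is_mannequin(user: dict) -> bool:
    """Return True when the REST 'user' object describes a migrated placeholder."""
    return user.get("type") == "Mannequin"


def display_author(user: dict, fallback=None) -> str:
    """Pick a readable author name, avoiding the UUID-like mannequin login."""
    if is_mannequin(user):
        # The login is an opaque identifier here, so prefer a name found elsewhere.
        return fallback or "unknown (migrated account)"
    return user["login"]


migrated = {"login": "91e69f45-91d9-4b12-87db-a02908296c81", "type": "Mannequin"}
regular = {"login": "sweeneyde", "type": "User"}

print(display_author(migrated))
print(display_author(regular))
```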

@Harry-Lees

I might've spoken too soon, it does look like it may be possible (https://docs.github.com/en/graphql/reference/objects#mannequin).

I managed to implement this functionality last night. I think the only thing left to do is send the emails.

What would be the nicest way to do this from your end? I noticed that in the new-bugs-announce script you're using a template which has been pre-uploaded to Mailgun. Would it be possible to do a similar thing here? From my perspective this would make the actual roundup_summary.py script shorter, as it would offload a lot of the templating work to Mailgun. Currently there are no external dependencies, so without using Jinja2 or similar, it takes a lot of messy string formatting to get a nice-looking template.

If this isn't an option, the script does currently generate an HTML file anyway which I think can still be sent over the Mailgun API.
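
For reference, sending a pre-rendered HTML body through Mailgun's messages endpoint could look roughly like the sketch below, which only builds the request and does not send it. The domain, API key, and addresses are placeholders, and `build_mailgun_request` is a hypothetical helper rather than part of the existing workflow.

```python
import base64
import urllib.parse
import urllib.request


def build_mailgun_request(domain, api_key, sender, recipient, subject, html_body):
    """Build (but do not send) a POST request for Mailgun's messages endpoint."""
    url = f"https://api.mailgun.net/v3/{domain}/messages"
    form = urllib.parse.urlencode(
        {"from": sender, "to": recipient, "subject": subject, "html": html_body}
    ).encode("utf-8")
    # Mailgun uses HTTP basic auth with the literal user name "api".
    token = base64.b64encode(f"api:{api_key}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=form, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values; real ones would come from repository secrets.
req = build_mailgun_request(
    "mg.example.org",
    "key-placeholder",
    "Summary <summary@mg.example.org>",
    "someone@example.org",
    "Summary of Python tracker issues",
    "<h1>ACTIVITY SUMMARY</h1>",
)
print(req.full_url)
# Sending would then be: urllib.request.urlopen(req)
```

Keeping the request construction separate from the actual send makes it easy to inspect the payload in a test run before any mail goes out.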

Finally, would there be some way of testing this? Just sending an email to myself, or someone else could test the final part as you have access to all the necessary accounts?

@ezio-melotti
Member Author

Finally, would there be some way of testing this? Just sending an email to myself, or someone else could test the final part as you have access to all the necessary accounts?

If we use GitHub Actions, there is a way to trigger them manually. It's also possible to pass arguments using something like:

on:
  workflow_dispatch:
    inputs:
      email:
        description: 'The email that will receive the report:'
        required: true
        default: '[email protected]'

Once the action is in place we could use this to test it and if it works we can schedule it to run every week.
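
On the script side, the recipient could then be read from the environment. This sketch assumes the workflow exports the `email` input as an environment variable named `REPORT_EMAIL`; that name and the `resolve_recipient` helper are hypothetical.

```python
import os


def resolve_recipient(environ=os.environ, default=None) -> str:
    """Return the report recipient from REPORT_EMAIL, or fall back to `default`."""
    value = environ.get("REPORT_EMAIL", "").strip()
    if not value:
        if default is None:
            raise ValueError("no recipient configured")
        return default
    if "@" not in value:
        # Very light sanity check; real validation is left to the mail service.
        raise ValueError(f"not an email address: {value!r}")
    return value


print(resolve_recipient({"REPORT_EMAIL": "tester@example.org"}))
```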

If instead we decide to host/run it somewhere else, then we don't need this, but I'm not sure what would be the best approach.

@terryjreedy

Since I am a consumer of the summary reports, you can send test emails to me as well, preferably with a link back to this or another issue for comments.

@pxeger

pxeger commented May 15, 2022

As discussed on python-dev, the weekly bpo roundup email is still being sent, despite "0 new issues". Can we shut that down?

@ezio-melotti
Member Author

Thanks for the ping, I have now replied to the thread.
