Bug 239731 - Duplicate stories without apparent cause in the Phabricator feed
Summary: Duplicate stories without apparent cause in the Phabricator feed
Status: New
Alias: None
Product: Services
Classification: Unclassified
Component: Code Review
Version: unspecified
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Phabric Admin
URL: https://reviews.freebsd.org/feed/
Keywords:
Depends on:
Blocks:
 
Reported: 2019-08-09 05:22 UTC by Tobias Kortkamp
Modified: 2019-08-20 09:47 UTC

Description Tobias Kortkamp freebsd_committer 2019-08-09 05:22:12 UTC
Hi,

on https://reviews.freebsd.org/feed/ there are many duplicate stories
like "jbeich edited P294" where a new one appears every 5 minutes
or so.  However https://reviews.freebsd.org/P294 only shows a single
edit.  This has been going on for ~2 days.  Basically the feed is
unusable at the moment because it is now full of spam.

Could somebody look into it?  Maybe whack whatever cron job(?) is
doing this?

Thank you.
Comment 1 Kurt Jaeger freebsd_committer 2019-08-20 09:47:17 UTC
- Debugging ?
  ssh rev
  mysql
  use phabricator_feed;
  to get a feeling about the data set size:
  select count(*) from feed_storydata;
  471994
  select count(*) from feed_storynotification;
  129769
  select count(*) from feed_storyreference;
  892478
- do we find patterns in the dump ?
  mysqldump phabricator_feed > feed.sql
  nope.
- look at it like this:
  select phid,storyData,dateCreated from feed_storydata where storyData like '%jbeich%' and dateCreated > 1500000000;
  jbeich == authorPHID = 'PHID-USER-w3txgruty3z35mqrlu3w'
   select phid,storyData,authorPHID from feed_storydata where authorPHID = 'PHID-USER-w3txgruty3z35mqrlu3w';
- select always returns a max of 1000 results ?
  so use the timestamp to select the latest events
  select storyData,phid,dateCreated,authorPHID from feed_storydata where authorPHID = 'PHID-USER-w3txgruty3z35mqrlu3w' and dateCreated > 1566000000 order by dateCreated;
- we see: approx. 300sec between those recurring events.
  What might be the source of those events ?
  cron ? no matching job seems to be configured there.
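The roughly 300-second spacing noted above can be checked mechanically instead of by eye. The following is a minimal sketch, not something that was run on the server: the function name `gaps` is hypothetical, and it assumes the timestamps arrive one per line in ascending order, as the `order by dateCreated` query produces them.

```shell
# Print the gap in seconds between consecutive epoch timestamps
# read from stdin (one per line, sorted ascending).
gaps() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Possible use with the query from the notes (-N suppresses the header row):
#   mysql -N -e "select dateCreated from feed_storydata
#                where authorPHID = 'PHID-USER-w3txgruty3z35mqrlu3w'
#                and dateCreated > 1566000000 order by dateCreated" \
#         phabricator_feed | gaps | sort -n | uniq -c
```

Piping the result through `sort -n | uniq -c` gives a histogram of gaps; a single dominant value near 300 would point to a fixed-interval timer rather than user activity.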
- does it come via http ? /var/log/nginx-access.log
  how to look at the nginx logfile
  cd /var/log
  setenv LESS -Mdec
  less nginx-access.log
  does not show recurring events (using a rough visual check)
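The visual check of the nginx log could be backed up by counting requests per client and path. This is a sketch under the assumption that the log uses nginx's default "combined" format, where field 1 is the client address and field 7 is the request path; the function name `top_requests` is hypothetical. An event repeating every 300 seconds for about two days would amount to roughly 2 x 24 x 12 = 576 requests from one source.

```shell
# List the most frequent (client address, request path) pairs in an
# access log. $1 = log file, $2 = number of rows to show (default 20).
# Assumes the "combined" log format: field 1 = client, field 7 = path.
top_requests() {
    awk '{ print $1, $7 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-20}"
}

# Possible use:
#   top_requests /var/log/nginx-access.log 20
```

If no pair comes close to that count, the recurring stories are more likely generated inside Phabricator (for example by a daemon) than triggered over HTTP.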
- Do we see something if we look at the event/change ?
  https://reviews.freebsd.org/P294
  That's a paste, so:
  use phabricator_pastebin;
  select id,phid,authorPHID from pastebin_paste;
  There aren't that many pastes (300)
  select id,phid,authorPHID from pastebin_paste where authorPHID = 'PHID-USER-w3txgruty3z35mqrlu3w';
  select * from pastebin_paste where id=294;
  is the paste we're seeing in the feed, and it has
  PHID-FILE-e5xsfnqyp56sqbie7ji6
  use phabricator_file;
  select * from file where phid = 'PHID-FILE-e5xsfnqyp56sqbie7ji6';
  has:
  storageEngine local-disk
  storageFormat raw
  storageHandle 0c/7e/727a8a6ed5c59f55162c1be24a51
  which translates to:
  /var/phabricator/large-files/0c/7e/727a8a6ed5c59f55162c1be24a51
- there are many files in
  /var/phabricator/large-files/
  approx. 160K files
  using 200 GB (there's room for 2.4 TB)
- size etc. is not unusual, so it's probably not the paste itself
- There are daemon processes inside phabric:
  ps ax | grep exec_daem
  65599  -  SsJ     0:00.32 php ./exec_daemon.php PhabricatorTaskmasterDaemon
  89562  -  SsJ   147:25.17 php ./exec_daemon.php PhabricatorRepositoryPullLocalD
  89564  -  SsJ     3:12.00 php ./exec_daemon.php PhabricatorFactDaemon
- transaction type:
  PhabricatorApplicationTransactionFeedStory
- Is it one of those processes ?
  where do we find the code ?
  cd /usr/local/www/phabricator/phabricator/src/infrastructure/daemon/workers/
  ls -l PhabricatorTaskmasterDaemon*
  cd /usr/local/www/phabricator/phabricator/src/applications/fact/daemon/
  ls -l PhabricatorFactDaemon.php
- search for it ?
  cd /usr/local/www/phabricator/phabricator/src
  find . -type f -exec grep Feed {} /dev/null \; |wc -l
  matches 673 lines
- analysis looks non-trivial: restart/reboot phab instance ?
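The search for "Feed" in the notes above matches every line containing that word, which is too broad to read through. A narrower pass could list only the files that mention the story class seen in the notes. This is a minimal sketch: the function name `files_mentioning` is hypothetical, and whether the class name appears literally in the relevant code is an assumption.

```shell
# List files under a directory that contain a fixed string.
# $1 = string to search for, $2 = directory (default: current directory).
files_mentioning() {
    grep -rlF -- "$1" "${2:-.}"
}

# Possible use:
#   cd /usr/local/www/phabricator/phabricator/src
#   files_mentioning PhabricatorApplicationTransactionFeedStory
```

The resulting file list should be much shorter than 673 lines and shows which code paths can publish this story type.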