Bug 248123

Summary: [NEW PORT] textproc/edcount: Estimate distinct count of values on the command line
Product: Ports & Packages
Reporter: Marcel Bischoff <marcel>
Component: Individual Port(s)
Assignee: freebsd-ports-bugs (Nobody) <ports-bugs>
Status: Open
Severity: Affects Only Me Keywords: feature
Priority: ---    
Version: Latest   
Hardware: Any   
OS: Any   
Attachments:
Description    Flags
Patch          none

Description Marcel Bischoff 2020-07-20 11:00:44 UTC
Created attachment 216601
Patch

Estimate distinct count of values from standard input. Provides a very fast way to perform unique count estimates on the command line.

The edcount program implements HyperLogLog, with some minor modifications, as detailed by Flajolet et al. in the paper "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm".

Additionally, the program's memory footprint is constant at a few megabytes, regardless of the number of records counted, and its accuracy does not degrade as the input grows.
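
For readers unfamiliar with the technique, here is a minimal Python sketch of a plain HyperLogLog estimator. It is not edcount's code and does not include the "minor modifications" mentioned above. The class name, the SHA-1 based hash, and the precision p=14 (16384 one-byte registers) are assumptions chosen for illustration only. It shows why memory stays constant: the register array has a fixed size no matter how many records are added.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog estimator (hypothetical, not edcount's code)."""

    def __init__(self, p=14):
        # p index bits -> m = 2**p registers; assumes p >= 7 so that the
        # alpha formula below (valid for m >= 128) applies.
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)  # fixed size, one byte each
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Derive a 64-bit hash from the first 8 bytes of SHA-1 (assumption:
        # any well-mixed 64-bit hash would do here).
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)               # top p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits
        # (64 - p + 1 if they are all zero).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Raw estimate: normalized harmonic mean of 2**register values.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while
        # some registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate
```

To mimic a command-line use, one could call add() on each stripped line of sys.stdin and print round(count()) at the end. Duplicate lines hash to the same register and rank, so they do not change the estimate.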

NOTE: this is my first attempt at a new port from scratch; please be kind.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2020-07-20 12:07:53 UTC
Nice work and congratulations on your first port Marcel!

At a cursory glance it looks good, but the best approach is to confirm that one's changes pass QA using our automated tools (at least portlint and poudriere), which pick up many issues.

For details and instructions, see: 

https://www.freebsd.org/doc/en/books/porters-handbook/testing.html

If you need help and pointers getting these up and running, jump on #freebsd-ports on freenode IRC, where there are plenty of people to support you.

We also have a whole bunch of cheatsheets and checklists for Issue Management (and Ports Issues in particular), which are available here:

https://wiki.freebsd.org/Bugzilla/

If you need any clarity on any of that, ask in #freebsd-bugs or #freebsd-ports again for pointers/clarifications (and we'll update the docs to improve them).
Comment 2 Marcel Bischoff 2020-07-20 13:34:40 UTC
Oh, I should have mentioned that I have already set up Poudriere to test the build as documented. This really streamlines the testing and gives helpful pointers on what to fix. :)
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2020-07-20 14:18:59 UTC
Ahh awesome :)

For future reports, you can mention that explicitly in your description:

portlint: OK (looks fine.)
testport: OK (poudriere: <versions>, <archs>, <OPTIONS> tested)
Comment 4 Marcel Bischoff 2020-07-20 15:50:59 UTC
Okay, so just add those lines to the comment?