Created attachment 147395 [details]
port shar

Apache Spark is a fast and general engine for large-scale data processing.

Spark runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. Spark has an advanced DAG execution engine that supports cyclic data flow and in-memory computing.

You can write applications quickly in Java, Scala or Python.

Spark powers a stack of high-level tools including Spark SQL, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these frameworks seamlessly in the same application.

If you have a Hadoop 2 cluster, you can run Spark without any installation needed. Otherwise, Spark is easy to run standalone or on EC2 or Mesos. It can read from HDFS, HBase, Cassandra, and any Hadoop data source.

WWW: http://spark.apache.org/
If I see it correctly, the Spark engine requires Java to be installed, but the port only has a build dependency on Maven. It would make sense to add OpenJDK 1.7 to both RUN_DEPENDS and BUILD_DEPENDS to have a fully functional port.
Created attachment 147452 [details]
spark shar

Added NO_ARCH=yes and a RUN_DEPENDS on Java 1.7.
Created attachment 147677 [details]
spark port shar

I discovered that Spark needs the Hadoop shared library at runtime.
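For reference, the dependency changes discussed so far might look roughly like this in the port Makefile. This is only a sketch, assuming the ports framework of that period (`USE_JAVA`, and `${PORTSDIR}` in dependency lines). The `yarn` file name and the `devel/hadoop2` origin are assumptions and are not taken from the attached shar.

```make
# Sketch only, not the submitted Makefile.

# The port installs no architecture-specific files.
NO_ARCH=	yes

# Ask the framework for a JDK, 1.7 or newer. With no further JAVA_*
# knobs set, this is expected to cover both build time and run time.
USE_JAVA=	yes
JAVA_VERSION=	1.7+

# Hadoop is needed when Spark runs (assumed file name and port origin).
RUN_DEPENDS+=	yarn:${PORTSDIR}/devel/hadoop2
```

Letting `USE_JAVA` pick the JDK avoids hardcoding one particular OpenJDK port in both BUILD_DEPENDS and RUN_DEPENDS.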
Can the port be renamed from spark to apache_spark? I've been working on an unrelated spark port, lang/spark (see http://www.spark-2014.org/), for a few months. I've hit technical snags which caused the delay, but in any case devel/spark would definitely be confused with lang/spark (spark-2014 could also legitimately be put in the devel category). Let's avoid the ambiguity before it occurs!
A few comments on the port:

1) I find the one-screen-sized copyright notice in the startup scripts redundant; no other ports include one.
2) The JAVA_VENDOR and JAVA_VERSION variables in the startup scripts are not used and not needed.
3) "/usr/local/share/spark/sbin" is hardcoded in start_worker.in.
4) There is an extra dependency on sbt, which is not needed. There is a documented procedure for building Spark with Maven: https://spark.apache.org/docs/1.1.0/building-with-maven.html
5) Hadoop is a runtime dependency, so there is no need to list it in LIB_DEPENDS.
6) The daemons do not require root privileges to run, so it is better to use a separate pseudo-user to start them.
7) It is wise to pre-build the Maven dependencies and fetch them as a tar file, so the build cluster does not download 250MB on each build.

I created the same port independently (I did not notice your submission), so I attach my work here for reference. I don't really care whose version gets committed; I just want the port to be in good shape before that happens.
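Points 3 and 6 could be addressed with Makefile lines roughly like the following. This is a sketch under stated assumptions, not taken from either submitted port: the rc script names and the `spark` user and group are hypothetical.

```make
# Sketch only; the names below are assumptions.

# Point 6: run the daemons as a dedicated pseudo-user. The name must
# also be registered in the ports tree's UIDs and GIDs files.
USERS=		spark
GROUPS=		spark

# Point 3: files/spark_master.in and files/spark_worker.in (hypothetical
# names) are run through the framework's substitution step, so the
# scripts can use %%DATADIR%%/sbin instead of the hardcoded
# /usr/local/share/spark/sbin.
USE_RC_SUBR=	spark_master spark_worker
SUB_LIST=	SPARK_USER=${USERS}
```

With this in place, the scripts would refer to `%%SPARK_USER%%` and `%%DATADIR%%`, and the port would keep working under a non-default PREFIX.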
Created attachment 147883 [details]
my version of spark port
(In reply to Dmitry Sivachenko from comment #6)
> Created attachment 147883 [details]
> my version of spark port

I know the shar existed before; that said, the same request applies here: set PKGNAMEPREFIX to "apache-".

"Apache Spark" is even a trademark, so there is precedent (http://spark.apache.org/).
(In reply to John Marino from comment #7)
> (In reply to Dmitry Sivachenko from comment #6)
> > Created attachment 147883 [details]
> > my version of spark port
>
> I know the shar existed before - that said, the same request applies to set
> PKGNAMEPREFIX to "apache-".
>
> "Apache Spark" is even a trademark, so there is precedent
> (http://spark.apache.org/)

No objection from my side.
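The prefix request amounts to a one-line change. A minimal sketch, assuming the port keeps `spark` as its PORTNAME:

```make
# Sketch only: the port directory can stay devel/spark, while the
# package name becomes apache-spark-<version>.
PORTNAME=	spark
PKGNAMEPREFIX=	apache-
```

This changes only the package name, so it disambiguates the package from a future lang/spark package without renaming the port directory.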
You violated my copyright by removing the copyright statements from my rc.d scripts. Write your own.
(In reply to Radim Kolar from comment #9)
> You violated my copyright by removing the copyright statements from my
> rc.d scripts.
>
> Write your own.

Radim, this is his quote: "I created the same port independently (I did not notice your submission), so I attach my work here for reference." That means he didn't use your rc.d scripts; he wrote his own. I think an apology is in order.
He is lying. Check his scripts and mine. He asked me by email for permission to remove my copyright notices from my scripts and suggested some minor changes. I didn't give him my permission; he removed them anyway.
It's like 25 lines long. Is it even standard practice to add copyrights (and licenses) to RC scripts in ports? This legal spat is going to ensure it's never committed. Who wants to deal with this stuff? I don't. If you want credit, I'd think the permanent "# Created by:" line would be enough.
As long as the title is getting tweaked, we might as well use the suggested port name. The PR is still stuck, though...
I'm moving this out of triage. Nobody has made any effort to resolve the tiff and nobody else wants to step in, thus a stalemate.