Bug 69086

Summary: Porters Handbook: How to convert from CR/LF to LF using REINPLACE_CMD
Product: Documentation
Reporter: Alexey Dokuchaev <danfe>
Component: Books & Articles
Assignee: freebsd-doc (Nobody) <doc>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Latest   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none

Description Alexey Dokuchaev 2004-07-15 08:00:44 UTC
Quite often, we have to port software written for or under Windows/DOS,
which uses the dumb CR/LF convention for text files instead of the plain
Unix LF.  This often causes problems with further patching, compiler
warnings, script execution (/bin/sh^M not found), etc.

Since there's no standard practice for dealing with such situations,
people come up with all sorts of solutions, e.g. supplying explicit
patches in files/, adding a dependency(!!!) on dos2unix, or wrapping
tr -d '\r' in a script and calling it, instead of using sed(1) as in my
proposal, which is simple and straightforward.

I therefore suggest including this in Porter's Handbook, as in attached
patch.

How-To-Repeat: Try porting some fairly complex CR/LF code.
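
For illustration only, here is a minimal shell sketch of what the proposed
sed(1) expression does to a single file.  The helper name strip_crlf and the
demo file are hypothetical; "sed -i.bak" is assumed to stand in for
${REINPLACE_CMD}, and the Makefile's "$$" becomes a plain "$" once it
reaches the shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ports tree): strip trailing control
# characters, including CR, from every line of the given files.
# "sed -i.bak" is assumed here as a stand-in for ${REINPLACE_CMD}; it
# leaves a .bak copy of each original file next to the edited one.
strip_crlf() {
    sed -i.bak -e 's/[[:cntrl:]]*$//' "$@"
}

# Demonstration on a throwaway script with CR/LF line endings.
tmpdir=$(mktemp -d)
printf '#!/bin/sh\r\necho hello\r\n' > "$tmpdir/demo.sh"
strip_crlf "$tmpdir/demo.sh"

# Dump the result byte by byte so that any remaining \r would be visible.
od -c "$tmpdir/demo.sh"
```

Note that any other trailing control character (a trailing tab, for
instance) is removed as well, which matches the caveat in the proposed
handbook text.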
Comment 1 Giorgos Keramidas freebsd_committer freebsd_triage 2004-07-18 03:04:00 UTC
On 2004-07-15 14:01, Alexey Dokuchaev <danfe@regency.nsu.ru> wrote:
> I therefore suggest including this in Porter's Handbook, as in
> attached patch.

This looks like a very nice way to filter the sources to me.  I'm not
very experienced with the way our ports system works though.

Do the people on freebsd-ports agree about this change?  In particular,
since he seems to be the most active ports committer that updates the
porters-handbook, does David O'Brien agree with this addition?

> Index: book.sgml
> ===================================================================
> RCS file: /home/pub/ftp/pub/FreeBSD/development/FreeBSD-CVS//doc/en_US.ISO8859-1/books/porters-handbook/book.sgml,v
> retrieving revision 1.461
> diff -u -r1.461 book.sgml
> --- book.sgml	12 Jul 2004 08:24:15 -0000	1.461
> +++ book.sgml	15 Jul 2004 06:55:26 -0000
> @@ -681,6 +681,24 @@
>  	  lines!); define <literal>USE_AUTOCONF_VER=213</literal> and take the
>  	  diffs of <filename>configure.in</filename>.</para>
>  
> +	<para>Quite often, there is a situation when ported software, being
> +	  primarily developed on Windows, uses CR/LF convention for most of its
> +	  source files.  This may cause problems with further patching, compiler
> +	  warnings, scripts execution (<command>/bin/sh^M</command> not found),
> +	  etc.  To quickly convert those files from CR/LF to just LF, you can do
> +	  something like this:</para>
> +
> +	<programlisting>USE_REINPLACE=	yes
> +
> +post-extract:
> +	@${FIND} -E ${WRKDIR} -type f -iregex ".*\.(c|cpp|h|txt)" \
> +		-exec ${REINPLACE_CMD} -e 's/[[:cntrl:]]*$$//' '{}' \;</programlisting>
> +
> +	<para>Of course, if you need to process each and every file,
> +	  <option>-iregex</option> above can be omitted.  Be aware that this
> +	  piece of code will strip all trailing control characters from each
> +	  line of processed file (except <literal>\n</literal>).</para>
> +
>  	<para>Also, if you had to delete a file, then you can do it in the
>  	  <maketarget>post-extract</maketarget> target rather than as part of
>  	  the patch.  Once you are happy with the resulting diff, please split
Comment 2 David E. O'Brien freebsd_committer freebsd_triage 2004-08-02 17:04:19 UTC
On Sun, Jul 18, 2004 at 05:04:00AM +0300, Giorgos Keramidas wrote:
> On 2004-07-15 14:01, Alexey Dokuchaev <danfe@regency.nsu.ru> wrote:
> > I therefore suggest including this in Porter's Handbook, as in
> > attached patch.
> 
> This looks like a very nice way to filter the sources to me.  I'm not
> very experienced with the way our ports system works though.
> 
> Do the people on freebsd-ports agree about this change?  In particular,
> since he seems to be the most active ports committer that updates the
> porters-handbook, does David O'Brien agree with this addition?

I'm really not the most active ports committer updating the
porters-handbook these days.  I personally have no problems with the
proposed change.

-- 
-- David  (obrien@FreeBSD.org)
Comment 3 Peter Pentchev 2004-08-06 14:43:47 UTC
On Thu, Jul 15, 2004 at 02:01:05PM +0700, Alexey Dokuchaev wrote:
> 
> >Number:         69086
> >Category:       docs
> >Synopsis:       Porters Handbook: How to convert from CR/LF to LF using REINPLACE_CMD

What do you think about the following patch, which advocates a bit more
efficient method (find/xargs will invoke REINPLACE_CMD an order of
magnitude less than invoking it for each and every file, or even on many
files in succession :), and also has some minor corrections and
rewording to the text above?

G'luck,
Peter
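
As a rough sketch of the find/xargs idea outside of a port Makefile, the
pipeline could look like the following.  The function name convert_tree is
hypothetical, "sed -i.bak" is assumed as a stand-in for ${REINPLACE_CMD},
and portable -name tests are used instead of "-E ... -iregex" (so the match
is case-sensitive here), since the latter is specific to FreeBSD find(1).

```shell
#!/bin/sh
# Hypothetical helper: convert CR/LF to LF in selected source files under
# the directory given as $1, batching many files into each sed invocation.
# -print0 / -0 keep file names with spaces or newlines intact.
# Caveat: with no matching files, some xargs implementations still run the
# command once with no arguments, so sed may complain about missing input.
convert_tree() {
    find "$1" -type f \
        \( -name '*.c' -o -name '*.cpp' -o -name '*.h' -o -name '*.txt' \) \
        -print0 |
        xargs -0 sed -i.bak -e 's/[[:cntrl:]]*$//'
}

# Demonstration on a throwaway tree.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/src"
printf 'int x;\r\n' > "$tmpdir/src/a.c"
printf 'notes\r\n' > "$tmpdir/README.txt"
convert_tree "$tmpdir"
od -c "$tmpdir/src/a.c"
```

Because xargs appends the file names itself, no '{}' placeholder or "\;"
terminator is needed after the sed expression.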

Index: doc/en_US.ISO8859-1/books/porters-handbook/book.sgml
===================================================================
RCS file: /home/ncvs/doc/en_US.ISO8859-1/books/porters-handbook/book.sgml,v
retrieving revision 1.470
diff -u -r1.470 book.sgml
--- doc/en_US.ISO8859-1/books/porters-handbook/book.sgml	5 Aug 2004 10:04:30 -0000	1.470
+++ doc/en_US.ISO8859-1/books/porters-handbook/book.sgml	6 Aug 2004 13:37:55 -0000
@@ -682,6 +682,25 @@
 	  lines!); define <literal>USE_AUTOCONF_VER=213</literal> and take the
 	  diffs of <filename>configure.in</filename>.</para>
 
+	<para>Quite often, there is a situation when the software being
+	  ported, especially if it is primarily developed on Windows, uses
+	  the CR/LF convention for most of its source files.  This may cause
+	  problems with further patching, compiler warnings, scripts
+	  execution (<command>/bin/sh^M</command> not found), etc.  To
+	  quickly convert those files from CR/LF to just LF, you can do
+	  something like this:</para>
+
+	<programlisting>USE_REINPLACE=	yes
+
+post-extract:
+	@${FIND} -E ${WRKDIR} -type f -iregex ".*\.(c|cpp|h|txt)" -print0 | \
+		${XARGS} -0 ${REINPLACE_CMD} -e 's/[[:cntrl:]]*$$//'</programlisting>
+
+	<para>Of course, if you need to process each and every file,
+	  <option>-iregex</option> above can be omitted.  Be aware that this
+	  piece of code will strip all trailing control characters from each
+	  line of processed file (except <literal>\n</literal>).</para>
+
 	<para>Also, if you had to delete a file, then you can do it in the
 	  <maketarget>post-extract</maketarget> target rather than as part of
 	  the patch.  Once you are happy with the resulting diff, please split

-- 
Peter Pentchev	roam@ringlet.net    roam@cnsys.bg    roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
If I were you, who would be reading this sentence?
Comment 4 Peter Pentchev 2004-08-06 16:45:14 UTC
On Fri, Aug 06, 2004 at 04:43:47PM +0300, Peter Pentchev wrote:
> On Thu, Jul 15, 2004 at 02:01:05PM +0700, Alexey Dokuchaev wrote:
> > 
> > >Number:         69086
> > >Category:       docs
> > >Synopsis:       Porters Handbook: How to convert from CR/LF to LF using REINPLACE_CMD
> 
> What do you think about the following patch, which advocates a bit more
> efficient method (find/xargs will invoke REINPLACE_CMD an order of
> magnitude less than invoking it for each and every file, or even on many
> files in succession :), and also has some minor corrections and
> rewording to the text above?

Or how about the following, which uses the &windows; entity properly? :)

G'luck,
Peter

Index: doc/en_US.ISO8859-1/books/porters-handbook/book.sgml
===================================================================
RCS file: /home/ncvs/doc/en_US.ISO8859-1/books/porters-handbook/book.sgml,v
retrieving revision 1.470
diff -u -r1.470 book.sgml
--- doc/en_US.ISO8859-1/books/porters-handbook/book.sgml	5 Aug 2004 10:04:30 -0000	1.470
+++ doc/en_US.ISO8859-1/books/porters-handbook/book.sgml	6 Aug 2004 15:40:46 -0000
@@ -18,6 +18,8 @@
 %mailing-lists;
 <!ENTITY % freebsd PUBLIC "-//FreeBSD//ENTITIES DocBook Miscellaneous FreeBSD Entities//EN">
 %freebsd;
+<!ENTITY % trademarks PUBLIC "-//FreeBSD//ENTITIES DocBook Trademark Entities//EN">
+%trademarks;
 <!ENTITY % urls PUBLIC "-//FreeBSD//ENTITIES DocBook URL Entities//EN">
 %urls;
 ]>
@@ -682,6 +684,25 @@
 	  lines!); define <literal>USE_AUTOCONF_VER=213</literal> and take the
 	  diffs of <filename>configure.in</filename>.</para>
 
+	<para>Quite often, there is a situation when the software being
+	  ported, especially if it is primarily developed on &windows;, uses
+	  the CR/LF convention for most of its source files.  This may cause
+	  problems with further patching, compiler warnings, scripts
+	  execution (<command>/bin/sh^M</command> not found), etc.  To
+	  quickly convert those files from CR/LF to just LF, you can do
+	  something like this:</para>
+
+	<programlisting>USE_REINPLACE=	yes
+
+post-extract:
+	@${FIND} -E ${WRKDIR} -type f -iregex ".*\.(c|cpp|h|txt)" -print0 | \
+		${XARGS} -0 ${REINPLACE_CMD} -e 's/[[:cntrl:]]*$$//'</programlisting>
+
+	<para>Of course, if you need to process each and every file,
+	  <option>-iregex</option> above can be omitted.  Be aware that this
+	  piece of code will strip all trailing control characters from each
+	  line of processed file (except <literal>\n</literal>).</para>
+
 	<para>Also, if you had to delete a file, then you can do it in the
 	  <maketarget>post-extract</maketarget> target rather than as part of
 	  the patch.  Once you are happy with the resulting diff, please split

-- 
Peter Pentchev	roam@ringlet.net    roam@cnsys.bg    roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
I had to translate this sentence into English because I could not read the original Sanskrit.
Comment 5 danfe 2004-08-09 04:05:01 UTC
On Fri, Aug 06, 2004 at 04:43:47PM +0300, Peter Pentchev wrote:
> On Thu, Jul 15, 2004 at 02:01:05PM +0700, Alexey Dokuchaev wrote:
> > 
> > >Number:         69086
> > >Category:       docs
> > >Synopsis:       Porters Handbook: How to convert from CR/LF to LF using REINPLACE_CMD
> 
> What do you think about the following patch, which advocates a bit more
> efficient method (find/xargs will invoke REINPLACE_CMD an order of
> magnitude less than invoking it for each and every file, or even on many
> files in succession :), and also has some minor corrections and
> rewording to the text above?

Agreed, thanks for your improvements.

./danfe
Comment 6 danfe 2004-08-09 04:05:28 UTC
On Fri, Aug 06, 2004 at 06:45:14PM +0300, Peter Pentchev wrote:
> On Fri, Aug 06, 2004 at 04:43:47PM +0300, Peter Pentchev wrote:
> > On Thu, Jul 15, 2004 at 02:01:05PM +0700, Alexey Dokuchaev wrote:
> > > 
> > > >Number:         69086
> > > >Category:       docs
> > > >Synopsis:       Porters Handbook: How to convert from CR/LF to LF using REINPLACE_CMD
> > 
> > What do you think about the following patch, which advocates a bit more
> > efficient method (find/xargs will invoke REINPLACE_CMD an order of
> > magnitude less than invoking it for each and every file, or even on many
> > files in succession :), and also has some minor corrections and
> > rewording to the text above?
> 
> Or how about the following, which uses the &windows; entity properly? :)

This also looks good.

./danfe
Comment 7 Peter Pentchev freebsd_committer freebsd_triage 2004-08-09 12:51:43 UTC
State Changed
From-To: open->closed

Committed with slight modifications.  Thanks for the patch!