Bug 194466 - [patch] Update printf(1) to support %T and %q bashisms
Summary: [patch] Update printf(1) to support %T and %q bashisms
Status: Closed Not Accepted
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 9.3-PRERELEASE
Hardware: Any Any
: Normal Affects Many People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2014-10-19 12:53 UTC by Mikhail T.
Modified: 2019-01-08 18:50 UTC (History)
6 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mikhail T. 2014-10-19 12:53:40 UTC
The printf-implementation built into recent bash offers a very useful
	feature:

              %(datefmt)T
                     causes  printf  to  output the date-time string resulting
                     from using datefmt as a format  string  for strftime(3).
                     The corresponding argument is an integer representing the
                     number of seconds since the epoch.  Two special argument
                     values  may  be used: -1 represents the current time, and
                     -2 represents the time the shell was invoked.

	The feature makes it easier for script to produce time-stamped logs --
	it can now be done without spawning a separate date(1) process for each
	log-entry.

	The enclosed patch updates our printf to support it. My patch one-ups
	bash-implementation by making the format string optional -- if simple
	%T is used, the "%a %b %d %T %Z %Y" is used by default (same as
	date(1)).

	My implementation also provides for future additions of other
	format-characters, which may need their own (sub)formats. For example,
	some day we may extend the built-in printf to print information about
	the process itself and any of the jobs -- using the same
	format-flexibility as ps(1):

		printf %(%cpu %mem)P -1

	The patch also implements the %q feature of bash's printf to help
	escape special characters -- for completeness.

	While here, I also update the oft-used PF-macro to simply use
	printf(3), instead of the asprintf->fputs->free sequence.
	This hunk is separate from the new functionality, though, and
	can be omitted, if the current code is correct.

Environment:
System: FreeBSD narawntapu 9.3-PRERELEASE

Fix:
Index: printf.1
===================================================================
--- printf.1	(revision 273142)
+++ printf.1	(working copy)
@@ -290,6 +290,29 @@
 .Cm \e0 Ns Ar num
 instead of
 .Cm \e Ns Ar num .
+.It Cm q
+As for
+.Cm s ,
+but with certain characters (such as braces, blanks, parentheses and the like)
+escaped. This makes the result (re)usable as shell argument or input.
+.It Cm [(format)]T
+Interpret the argument as the number of seconds since Unix epoch and turn
+it into a timestamp according to
+.Ar format .
+If format is omitted, the string "%a %b %d %T %Z %Y" will be used -- this
+is, what date(1) does by default. The specified format is passed to strftime(3).
+As with
+.Nm bash ,
+two special values for seconds-value are recognized to mean:
+.Bl -tag -width Ds
+.It -1
+current time. This allows scripts to log messages with timestamps without
+executing date(1) for each message.
+.It -2
+time, when the current sh-process was started. When using the standalone
+.Nm
+(rather than the sh-builtin), -2 and -1 are identical.
+.El
 .It Cm \&%
 Print a `%'; no argument is used.
 .El
@@ -322,6 +345,7 @@
 .Xr echo 1 ,
 .Xr sh 1 ,
 .Xr printf 3
+.Xr strftime 3
 .Sh STANDARDS
 The
 .Nm
Index: printf.c
===================================================================
--- printf.c	(revision 273142)
+++ printf.c	(working copy)
@@ -47,7 +47,12 @@
   "$FreeBSD$";
 #endif /* not lint */
 
+#include <sys/time.h>
 #include <sys/types.h>
+#ifdef SHELL
+#include <sys/sysctl.h>
+#include <sys/user.h>
+#endif
 
 #include <err.h>
 #include <errno.h>
@@ -67,20 +72,15 @@
 #endif
 
 #define	PF(f, func) do {						\
-	char *b = NULL;							\
 	if (havewidth)							\
 		if (haveprec)						\
-			(void)asprintf(&b, f, fieldwidth, precision, func); \
+			(void)printf(f, fieldwidth, precision, func);	\
 		else							\
-			(void)asprintf(&b, f, fieldwidth, func);	\
+			(void)printf(f, fieldwidth, func);		\
 	else if (haveprec)						\
-		(void)asprintf(&b, f, precision, func);			\
+		(void)printf(f, precision, func);			\
 	else								\
-		(void)asprintf(&b, f, func);				\
-	if (b) {							\
-		(void)fputs(b, stdout);					\
-		free(b);						\
-	}								\
+		(void)printf(f, func);					\
 } while (0)
 
 static int	 asciicode(void);
@@ -93,6 +93,9 @@
 static const char
 		*getstr(void);
 static char	*mknum(char *, char);
+#if defined(SHELL)
+static void	 getstarttime(struct timeval *ptv);
+#endif
 static void	 usage(void);
 
 static char **gargv;
@@ -191,7 +194,10 @@
 {
 	static const char skip1[] = "#'-+ 0";
 	static const char skip2[] = "0123456789";
+	static const char needescaping[] = "\n\t !\"$&'()*,;<>?[\\]^`{|}";
+
 	char *fmt;
+	static char *subfmt;
 	int fieldwidth, haveprec, havewidth, mod_ldbl, precision;
 	char convch, nextch;
 
@@ -281,6 +287,25 @@
 		PF(start, p);
 		break;
 	}
+	case 'q': {
+		const char *s, *p;
+
+		s = getstr();
+		/*
+		 * #-signs only need escaping, if found at the beginning.
+		 * ~ is only escaped at the beginning or after ':' or '='
+		 * Other special characters are always escaped:
+		 */
+		for (p = s; *p != '\0'; p++) {
+			if ((*p == '#' && p == s) ||
+			    (*p == '~' &&
+				(p == s || p[-1] == ':' || p[-1] == '=')) ||
+			    index(needescaping, *p)) {
+				putchar('\\');
+			}
+			putchar(*p);
+		}
+	}
 	case 's': {
 		const char *p;
 
@@ -319,6 +344,98 @@
 			PF(start, (double)p);
 		break;
 	}
+	case '(': {
+		char	*end = index(fmt + 1, ')');
+
+		if (end == NULL) {
+			warnx("No closing ')' detected");
+			return (NULL);
+		}
+			
+		if (subfmt != NULL) {
+			warnx("Subformat \"%s\" already set", subfmt);
+			return NULL;
+		}
+		if (end == fmt)
+			break; /* Empty format -- presume default */
+		asprintf(&subfmt, "%c%.*s", nextch,
+		    (int)(end - fmt - 1), fmt + 1);
+		if (subfmt == NULL) {
+			warnx("%s", strerror(ENOMEM));
+			return (NULL);
+		}
+		*fmt = nextch;
+		*end = '%';
+		return (end);
+	}
+	case 'T': {
+		char		*timebuf = NULL;
+		size_t		 buflen, timelen;
+		const char	*timefmt;
+		intmax_t	 val;
+		uintmax_t	 uval;
+		struct timeval	 tv;
+
+		if (subfmt == NULL)
+			timefmt = "%a %b %d %T %Z %Y"; /* date(1)'s default */
+		else
+			timefmt = subfmt;
+		if (getnum(&val, &uval, 1) || (tv.tv_sec = val) != val) {
+			if (subfmt != NULL) {
+				free(subfmt);
+				subfmt = NULL;
+			}
+			*rval = 1;
+			return NULL;
+		}
+		switch (tv.tv_sec) {
+		case -2:
+#if defined(SHELL)
+			getstarttime(&tv);
+			break;
+		/* Outside of shell, -2 and -1 both mean current time */
+#endif
+		case -1:
+			gettimeofday(&tv, NULL);
+			break;
+		default:
+			if (tv.tv_sec < 0) {
+				warnx("Can not turn %jd into date/time", val);
+				if (subfmt != NULL) {
+					free(subfmt);
+					subfmt = NULL;
+				}
+				*rval = 1;
+				return NULL;
+			}
+		}
+		/*
+		 * No (easy) way to know, how big the buffer needs to be.
+		 * Start small and double as necessary:
+		 */
+		buflen = strlen(timefmt) * 4; /* harmless overestimate */
+		do {
+			buflen *= 2;
+			timebuf = realloc(timebuf, buflen);
+			if (timebuf == NULL) {
+				if (subfmt != NULL) {
+					free(subfmt);
+					subfmt = NULL;
+				}
+				warnx("%s", strerror(ENOMEM));
+				return (NULL);
+			}
+			timelen = strftime(timebuf, buflen, timefmt,
+			    localtime(&tv.tv_sec));
+		} while(timelen == 0);
+		fputs(timebuf, stdout);
+		free(timebuf);
+		if (subfmt != NULL) {
+			free(subfmt);
+			subfmt = NULL;
+		}
+		break;
+	}
 	default:
 		warnx("illegal format character %c", convch);
 		return (NULL);
@@ -559,7 +676,43 @@
 	return (ch);
 }
 
+#ifdef SHELL
 static void
+getstarttime(struct timeval *ptv)
+{
+	static time_t		start; /* We cache this value */
+	int			mib[4];
+	size_t			len;
+	struct kinfo_proc	kp;
+
+	ptv->tv_sec = 0;
+	ptv->tv_usec = 0;	/* Not using microseconds at all */
+
+	if (start != 0) {
+		ptv->tv_sec = start;
+		return;
+	}
+
+	len = 4;
+	if (sysctlnametomib("kern.proc.pid", mib, &len) != 0) {
+		warn("Unable to obtain script start-time: %s",
+		    "sysctlnametomib");
+		return;
+	}
+	mib[3] = getpid();
+	len = sizeof(kp);
+	if (sysctl(mib, 4, &kp, &len, NULL, 0) != 0) {
+		warn("Unable to obtain script start-time: %s",
+		    "sysctl");
+		return;
+	}
+	/* XXX Should we check, whether len still equals sizeof(kinfo_proc)? */
+
+	start = ptv->tv_sec = kp.ki_start.tv_sec;
+}
+#endif
+
+static void
 usage(void)
 {
 	(void)fprintf(stderr, "usage: printf format [arguments ...]\n");
Comment 1 Baptiste Daroussin freebsd_committer freebsd_triage 2014-11-04 23:33:19 UTC
Nice!

Can you also add regression tests about those (we now have regression tests)?
Comment 2 Mikhail T. 2014-11-04 23:55:11 UTC
(In reply to Baptiste Daroussin from comment #1)
> Can you also add regression tests about those (we now have regression tests)?

Given the skepticism expressed by jilles@ in a private e-mail earlier, probably not... Unless he changes his mind, this change seems unlikely to be committed -- regardless of whether regression tests for the new functionality are provided.
Comment 3 Pedro F. Giffuni freebsd_committer freebsd_triage 2014-11-05 00:55:33 UTC
FWIW;

I also forwarded the patch to the illumos developer that has been doing work on our shared printf(1) and he was not interested in the %T option either.

The %q does have some chance though.
Comment 4 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-01-08 18:50:21 UTC
Although available in GNU, the changes are non-standard.
Neither illumos nor us seem interested in the feature.