Plan 9 from Bell Labs’s /usr/web/sources/patch/sorry/long-regexps/readme

Copyright © 2021 Plan 9 Foundation.
Distributed under the MIT License.

fix the reported "bug in awk" where long regexps are not evaluated correctly.

the same problem exists in both the native and ape versions of regexp:
character classes are restricted in size, perhaps unrealistically, to 64
characters (Runes/wchar_ts). In these days of Gbytes I went to the opposite
extreme and allowed 256 characters.
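
a rough model of why a 52-letter class can overflow a 64-entry limit,
offered as a sketch only. it assumes each listed character is stored as a
(low, high) pair of entries; that is a guess about the storage layout, not
something this readme states. the names Class and classbuild are made up
for illustration and are not the library's, and the code uses standard C
headers so it builds outside Plan 9.

```c
#include <assert.h>
#include <stddef.h>

enum { MAXSLOTS = 256 };	/* upper bound for this model only */

/* hypothetical model of a character class, not the libregexp structure */
typedef struct Class Class;
struct Class {
	size_t n;		/* slots used so far */
	size_t cap;		/* slots available */
	int spans[MAXSLOTS];	/* assumed layout: low0, high0, low1, high1, ... */
};

/*
 * store every character of members as a one-character range.
 * returns 0 if the class fits in cap slots, -1 if it overflows.
 */
int
classbuild(Class *c, const char *members, size_t cap)
{
	assert(cap <= MAXSLOTS && cap % 2 == 0);
	c->n = 0;
	c->cap = cap;
	for(; *members != '\0'; members++){
		if(c->n + 2 > c->cap)
			return -1;	/* class too large for the table */
		c->spans[c->n++] = (unsigned char)*members;	/* low end */
		c->spans[c->n++] = (unsigned char)*members;	/* high end */
	}
	return 0;
}
```

under that assumption the 26-letter class needs 52 entries and fits in 64,
the 52-letter class needs 104 and does not, and both fit in 256.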

test with awk (for ape) using:

	echo hello | 8.out  '/[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ]*/ { print "hit" }'

for native I used this:

#include <u.h>
#include <libc.h>
#include <regexp.h>

void
main(int argc, char *argv[])
{
	Reprog *re1, *re2;

	USED(argc);
	USED(argv);

	/* short class: 26 letters */
	re1 = regcomp("[abcdefghijklmnopqrstuvwxyz]*");
	/* long class: 52 letters, the case this patch is meant to fix */
	re2 = regcomp("[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ]*");
	if(re1 == nil || re2 == nil)
		sysfatal("regcomp failed");

	if(regexec(re1, "hello", nil, 0))
		print("match short expression\n");

	if(regexec(re2, "hello", nil, 0))
		print("match long expression\n");

	exits(nil);
}


-Steve
