implicit Unicode data tables generation

Message ID d391c404-3bf7-d6a1-4313-96e1bde16009@towo.net
State New

Commit Message

Thomas Wolff March 7, 2018, 11:27 p.m.
This patch adds rules to generate Unicode data tables for
libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
however, the build process is not yet covered.
The rules define how the necessary data tables are built,
but additional dependencies are missing to get the rules actually invoked,
like:
(string/Makefile.am)
wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
(ctype/Makefile.am)
categories.c:   $(srcdir)/categories.t
towctrans_l.c:  $(srcdir)/caseconv.t
I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
and even an additional dummy wcwidth.h) but it did not work;
I do not know how to specify an additional target for lib_a-wcwidth.o
(make would say “overriding recipe” and “ignoring old recipe”)
and I have no idea how to fiddle that into Makefile.am anyway,
so if anyone wants to advocate for the implicit approach, please
add that part.
From 4b2a7314ec85cd714910cb04ffce92d1cb203054 Mon Sep 17 00:00:00 2001
From: Thomas Wolff <towo@towo.net>
Date: Thu, 8 Mar 2018 00:10:41 +0100
Subject: [PATCH] make rules for implicit Unicode data tables generation

Dependency rules to generate Unicode data tables
for libc functions wcwidth, tow* and isw*.
---
 newlib/libc/ctype/Makefile.am  | 20 ++++++++++++++++++++
 newlib/libc/ctype/Makefile.in  | 20 ++++++++++++++++++++
 newlib/libc/string/Makefile.am | 24 ++++++++++++++++++++++++
 newlib/libc/string/Makefile.in | 24 ++++++++++++++++++++++++
 4 files changed, 88 insertions(+)

Comments

Corinna Vinschen March 8, 2018, 8:07 a.m. | #1
On Mar  8 00:27, Thomas Wolff wrote:
> This patch adds rules to generate Unicode data tables for
> libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
> however, the build process is not yet covered.
> The rules define how the necessary data tables are built,
> but additional dependencies are missing to get the rules actually invoked,
> like:
> (string/Makefile.am)
> wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
> (ctype/Makefile.am)
> categories.c:   $(srcdir)/categories.t
> towctrans_l.c:  $(srcdir)/caseconv.t
> I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
> and even an additional dummy wcwidth.h) but it did not work;


Look at ctype/Makefile.am, last line:

$(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h

This should work for this case here, too.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
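
To make the hint above concrete: the thread names the object file as lib_a-wcwidth.o, so a target written as $(lpfx)wcwidth.$(oext) presumably expands to that name when lpfx is "lib_a-" and oext is "o". The shell sketch below only mimics that expansion. The two variable values are assumptions inferred from the object name quoted in the thread, not read from the generated Makefile, and object_target is a hypothetical helper.

```shell
#!/bin/sh
# Assumed values, inferred from the "lib_a-wcwidth.o" name in the thread.
lpfx="lib_a-"
oext="o"

# Mimic how make would expand $(lpfx)STEM.$(oext) for a given source stem.
object_target() {
    printf '%s%s.%s\n' "$lpfx" "$1" "$oext"
}

# The three objects that need extra prerequisites in this patch.
for stem in wcwidth categories towctrans_l; do
    object_target "$stem"
done
```

Attaching the .t files to the object target rather than to the .c file matters because the object is what make actually rebuilds; the .c file is a source that never gets a recipe of its own.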
Thomas Wolff March 8, 2018, 10:09 p.m. | #2
On 08.03.2018 at 09:07, Corinna Vinschen wrote:
> On Mar  8 00:27, Thomas Wolff wrote:
>> This patch adds rules to generate Unicode data tables for
>> libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
>> however, the build process is not yet covered.
>> The rules define how the necessary data tables are built,
>> but additional dependencies are missing to get the rules actually invoked,
>> like:
>> (string/Makefile.am)
>> wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
>> (ctype/Makefile.am)
>> categories.c:   $(srcdir)/categories.t
>> towctrans_l.c:  $(srcdir)/caseconv.t
>> I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
>> and even an additional dummy wcwidth.h) but it did not work;
> Look at ctype/Makefile.am, last line:
>
> $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
>
> This should work for this case here, too.

Thanks, the following rules actually trigger the dependencies:
(string)
$(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
(ctype)
$(lpfx)categories.$(oext): categories.c categories.t
$(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t

but they cause obscure compilation errors:
In file included from /usr/include/sys/config.h:234:0,
                  from /usr/include/_ansi.h:11,
                  from ../../../.././newlib/libc/string/wcwidth.c:91:
/usr/include/cygwin/config.h:40:29: fatal error: ../tlsoffsets64.h: No such file or directory
  #include "../tlsoffsets64.h"
                              ^
(Maybe it's just one of these volatile errors that occasionally occur 
when building cygwin?)
I'll try on another system, or else just provide the patch for Brian 
(who requested the implicit approach) to work this out...
Thomas


---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
Thomas Wolff March 9, 2018, 7:46 a.m. | #3
On 08.03.2018 at 23:09, Thomas Wolff wrote:
> On 08.03.2018 at 09:07, Corinna Vinschen wrote:
>> On Mar  8 00:27, Thomas Wolff wrote:
>>> This patch adds rules to generate Unicode data tables for
>>> libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
>>> however, the build process is not yet covered.
>>> The rules define how the necessary data tables are built,
>>> but additional dependencies are missing to get the rules actually invoked,
>>> like:
>>> (string/Makefile.am)
>>> wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
>>> (ctype/Makefile.am)
>>> categories.c:   $(srcdir)/categories.t
>>> towctrans_l.c:  $(srcdir)/caseconv.t
>>> I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
>>> and even an additional dummy wcwidth.h) but it did not work;
>> Look at ctype/Makefile.am, last line:
>>
>> $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
>>
>> This should work for this case here, too.
> Thanks, the following rules actually trigger the dependencies:
> (string)
> $(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
> (ctype)
> $(lpfx)categories.$(oext): categories.c categories.t
> $(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t
>
> but they cause obscure compilation errors:
> In file included from /usr/include/sys/config.h:234:0,
>                  from /usr/include/_ansi.h:11,
>                  from ../../../.././newlib/libc/string/wcwidth.c:91:
> /usr/include/cygwin/config.h:40:29: fatal error: ../tlsoffsets64.h: No such file or directory
>  #include "../tlsoffsets64.h"
>                              ^
> (Maybe it's just one of these volatile errors that occasionally occur
> when building cygwin?)
> I'll try on another system, or else just provide the patch for Brian
> (who requested the implicit approach) to work this out...
> Thomas

Replacement patch for further analysis attached.
From 4b2a7314ec85cd714910cb04ffce92d1cb203054 Mon Sep 17 00:00:00 2001
From: Thomas Wolff <towo@towo.net>
Date: Thu, 8 Mar 2018 00:10:41 +0100
Subject: [PATCH] make rules for implicit Unicode data tables generation

Dependency rules to generate Unicode data tables
for libc functions wcwidth, tow* and isw*.
---
 newlib/libc/ctype/Makefile.am  | 20 ++++++++++++++++++++
 newlib/libc/ctype/Makefile.in  | 20 ++++++++++++++++++++
 newlib/libc/string/Makefile.am | 24 ++++++++++++++++++++++++
 newlib/libc/string/Makefile.in | 24 ++++++++++++++++++++++++
 4 files changed, 88 insertions(+)

diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
index fa6a70d..714b333 100644
--- a/newlib/libc/ctype/Makefile.am
+++ b/newlib/libc/ctype/Makefile.am
@@ -135,3 +135,23 @@ CHEWOUT_FILES= \
 CHAPTERS = ctype.tex
 
 $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# case conversion and character category data for libc/ctype/??w*.c
+
+$(lpfx)categories.$(oext): categories.c categories.t
+
+$(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t
+
+$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcategories
+
+$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcaseconv
+
diff --git a/newlib/libc/ctype/Makefile.in b/newlib/libc/ctype/Makefile.in
index 9932a94..ffcb384 100644
--- a/newlib/libc/ctype/Makefile.in
+++ b/newlib/libc/ctype/Makefile.in
@@ -1158,3 +1158,23 @@ $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# case conversion and character category data for libc/ctype/??w*.c
+
+$(lpfx)categories.$(oext): categories.c categories.t
+
+$(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t
+
+$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcategories
+
+$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcaseconv
+
diff --git a/newlib/libc/string/Makefile.am b/newlib/libc/string/Makefile.am
index 49de080..2617112 100644
--- a/newlib/libc/string/Makefile.am
+++ b/newlib/libc/string/Makefile.am
@@ -168,3 +168,27 @@ wcscasecmp_l.def wcscoll_l.def	wcsncasecmp_l.def wcsxfrm_l.def \
 strverscmp.def	strnstr.def	wmempcpy.def
 
 CHAPTERS = strings.tex wcstrings.tex
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# width data for libc/string/wcwidth.c
+
+$(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
+
+$(srcdir)/combining.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B +D7B0-D7C6 +D7CB-D7FB c > combining.t
+
+$(srcdir)/WIDTH-A:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwidthA
+
+$(srcdir)/ambiguous.t:	$(srcdir)/WIDTH-A $(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +WIDTH-A -cat=Me -cat=Mn -cat=Cf c > ambiguous.t
+
+$(srcdir)/wide.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwide
+
diff --git a/newlib/libc/string/Makefile.in b/newlib/libc/string/Makefile.in
index eb8fafc..3758f71 100644
--- a/newlib/libc/string/Makefile.in
+++ b/newlib/libc/string/Makefile.in
@@ -1416,3 +1416,27 @@ docbook: $(DOCBOOK_OUT_FILES)
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# width data for libc/string/wcwidth.c
+
+$(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
+
+$(srcdir)/combining.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B +D7B0-D7C6 +D7CB-D7FB c > combining.t
+
+$(srcdir)/WIDTH-A:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwidthA
+
+$(srcdir)/ambiguous.t:	$(srcdir)/WIDTH-A $(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +WIDTH-A -cat=Me -cat=Mn -cat=Cf c > ambiguous.t
+
+$(srcdir)/wide.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwide
+
Corinna Vinschen March 9, 2018, 9:17 a.m. | #4
On Mar  8 23:09, Thomas Wolff wrote:
> On 08.03.2018 at 09:07, Corinna Vinschen wrote:
> > On Mar  8 00:27, Thomas Wolff wrote:
> > > This patch adds rules to generate Unicode data tables for
> > > libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
> > > however, the build process is not yet covered.
> > > The rules define how the necessary data tables are built,
> > > but additional dependencies are missing to get the rules actually invoked,
> > > like:
> > > (string/Makefile.am)
> > > wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
> > > (ctype/Makefile.am)
> > > categories.c:   $(srcdir)/categories.t
> > > towctrans_l.c:  $(srcdir)/caseconv.t
> > > I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
> > > and even an additional dummy wcwidth.h) but it did not work;
> > Look at ctype/Makefile.am, last line:
> >
> > $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
> >
> > This should work for this case here, too.
> Thanks, the following rules actually trigger the dependencies:
> (string)
> $(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
> (ctype)
> $(lpfx)categories.$(oext): categories.c categories.t
> $(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t
>
> but they cause obscure compilation errors:
> In file included from /usr/include/sys/config.h:234:0,
>                  from /usr/include/_ansi.h:11,
>                  from ../../../.././newlib/libc/string/wcwidth.c:91:
> /usr/include/cygwin/config.h:40:29: fatal error: ../tlsoffsets64.h: No such file or directory
>  #include "../tlsoffsets64.h"
>                              ^
> (Maybe it's just one of these volatile errors that occasionally occur when
> building cygwin?)


I'm building Cygwin quite a lot, often multiple times a day, for the
last 20 years.  I'm not aware of any "volatile errors" when building
Cygwin.  However, if you change the build system, you often have to
rebuild from scratch.  Did you try that?

The above error message shows that you're doing something weird when
building Cygwin.  Look at the paths.  They point to your system's
include paths, i.e.

  /usr/include/sys/config.h

That's fishy.  The include paths should all point inside the source
tree, i.e.

  -I/home/corinna/src/cygwin/vanilla/winsup/cygwin/include

etc.  The fact that the compiler picks up your system paths indicates
that you don't build correctly.

Are you building inside the source tree?  That's not supported and
may actually lead to some weird build errors.

https://cygwin.com/faq/faq.html#faq.programming.building-cygwin


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
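
As a small aid for the question about building inside the source tree, the following shell sketch checks whether a build directory and a source directory resolve to the same place. It is only an illustration: same_directory is a hypothetical helper, and the scratch directories stand in for a real checkout and build directory.

```shell
#!/bin/sh
# same_directory A B
# Succeeds (status 0) when A and B resolve to the same physical directory,
# fails with status 1 when they differ, and with status 2 when either
# argument is not an accessible directory.
same_directory() {
    a=$(cd "$1" && pwd -P) || return 2
    b=$(cd "$2" && pwd -P) || return 2
    [ "$a" = "$b" ]
}

# Demo with throwaway directories standing in for a checkout and a build dir.
work=$(mktemp -d)
mkdir "$work/src" "$work/build"
if same_directory "$work/src" "$work/build"; then
    echo "in-tree build: configure from a separate directory instead"
else
    echo "separate build directory"
fi
rm -rf "$work"
```

Resolving both paths with pwd -P before comparing them means that symlinked or relative paths to the same directory are still recognized as identical.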
Thomas Wolff March 9, 2018, 10:50 p.m. | #5
On 09.03.2018 at 08:46, Thomas Wolff wrote:
> On 08.03.2018 at 23:09, Thomas Wolff wrote:
>> On 08.03.2018 at 09:07, Corinna Vinschen wrote:
>>> On Mar  8 00:27, Thomas Wolff wrote:
>>>> This patch adds rules to generate Unicode data tables for
>>>> libc/string/wcwidth, libc/ctype/tow* and libc/ctype/isw* implicitly,
>>>> however, the build process is not yet covered.
>>>> The rules define how the necessary data tables are built,
>>>> but additional dependencies are missing to get the rules actually invoked,
>>>> like:
>>>> (string/Makefile.am)
>>>> wcwidth.c:    $(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
>>>> (ctype/Makefile.am)
>>>> categories.c:   $(srcdir)/categories.t
>>>> towctrans_l.c:  $(srcdir)/caseconv.t
>>>> I tried some variations (like $(srcdir)/wcwidth.c:..., lib_a-wcwidth.o:...,
>>>> and even an additional dummy wcwidth.h) but it did not work;
>>> Look at ctype/Makefile.am, last line:
>>>
>>> $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
>>>
>>> This should work for this case here, too.
>> Thanks, the following rules actually trigger the dependencies:
>> (string)
>> $(lpfx)wcwidth.$(oext): wcwidth.c ambiguous.t combining.t wide.t
>> (ctype)
>> $(lpfx)categories.$(oext): categories.c categories.t
>> $(lpfx)towctrans_l.$(oext): towctrans_l.c caseconv.t
>>
>> but they cause obscure compilation errors:
>> ...
>> I'll try on another system, or else just provide the patch for Brian
>> (who requested the implicit approach) to work this out...
>> Thomas
> Replacement patch for further analysis attached.

OK, this patch as attached to my previous mail apparently works, 
verified after rebuild:
0001-implicit-Unicode-data-tables-generation.patch

Thomas


Corinna Vinschen March 12, 2018, 1:35 p.m. | #6
On Mar  9 08:46, Thomas Wolff wrote:
> diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
> index fa6a70d..714b333 100644
> --- a/newlib/libc/ctype/Makefile.am
> +++ b/newlib/libc/ctype/Makefile.am
> @@ -135,3 +135,23 @@ CHEWOUT_FILES= \
>  CHAPTERS = ctype.tex
>  
>  $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
> +
> +#############################################################################
> +# Unicode data
> +
> +$(srcdir)/%.txt:
> +	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .


This is a no-no.  Do not create links into the OS tree.  You don't
even know if the file exists.

> +$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
> +	cd $(srcdir); sh ./mkcategories
> +
> +$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
> +	cd $(srcdir); sh ./mkcaseconv
> +


Consequently, these rules are broken.   Consider that somebody might
build from git without manually installing UnicodeData.txt.  In that
case the above rules lead to a build error, along the lines of

  UnicodeData.txt not found
  Error 1
  ...

It would be helpful to create rules which just skip the dependency
if UnicodeData.txt doesn't exist in the source tree.  Unfortunately,
the only way to do that off the top of my head is to make the *.t files
depend on a phony target which then does everything in shell.  Better
ideas highly appreciated.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
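
The idea of doing the existence check in shell, as floated in the message above, could look roughly like the sketch below. It is not part of any posted patch: table_action is a hypothetical helper that only reports a decision, and a real rule would run the generator script (for example mkcategories) in the "regenerate" case. The comparison uses find -newer so that it stays within POSIX shell.

```shell
#!/bin/sh
# table_action DIR TABLE
# Prints what an implicit rule could do for DIR/TABLE:
#   skip        - DIR/UnicodeData.txt is absent, keep the table from git
#   current     - the table exists and the data file is not newer
#   regenerate  - the data file is present and the table is missing or older
# It always returns 0, so a missing data file cannot fail the build.
table_action() {
    data="$1/UnicodeData.txt"
    table="$1/$2"
    if [ ! -r "$data" ]; then
        echo skip
    elif [ -e "$table" ] && [ -z "$(find "$data" -newer "$table")" ]; then
        echo current
    else
        echo regenerate
    fi
}

# Demo in a scratch directory standing in for libc/ctype.
work=$(mktemp -d)
table_action "$work" categories.t     # no data file yet
: > "$work/UnicodeData.txt"
table_action "$work" categories.t     # data present, table missing
rm -rf "$work"
```

Because the function never returns a failure status, a checkout without UnicodeData.txt simply keeps the tables that are stored in git.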
Yaakov Selkowitz March 12, 2018, 1:41 p.m. | #7
On 2018-03-12 08:35, Corinna Vinschen wrote:
> On Mar  9 08:46, Thomas Wolff wrote:
>> diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
>> index fa6a70d..714b333 100644
>> --- a/newlib/libc/ctype/Makefile.am
>> +++ b/newlib/libc/ctype/Makefile.am
>> @@ -135,3 +135,23 @@ CHEWOUT_FILES= \
>>  CHAPTERS = ctype.tex
>>  
>>  $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
>> +
>> +#############################################################################
>> +# Unicode data
>> +
>> +$(srcdir)/%.txt:
>> +	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
>
> This is a no-no.  Do not create links into the OS tree.  You don't
> even know if the file exists.
>
>> +$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
>> +	cd $(srcdir); sh ./mkcategories
>> +
>> +$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
>> +	cd $(srcdir); sh ./mkcaseconv
>> +
>
> Consequently, these rules are broken.   Consider that somebody might
> build from git without manually installing UnicodeData.txt.  In that
> case the above rules lead to a build error, along the lines of
>
>   UnicodeData.txt not found
>   Error 1
>   ...
>
> It would be helpful to create rules which just skip the dependency
> if UnicodeData.txt doesn't exist in the source tree.  Unfortunately,
> the only way to do that off the top of my head is to make the *.t files
> depend on a phony target which then does everything in shell.  Better
> ideas highly appreciated.


I think either unicode-ucd is a required build dependency, or otherwise
these rules would have to be explicit only and keep the generated files
in git for the benefit of those without it.

-- 
Yaakov
Corinna Vinschen March 12, 2018, 2:21 p.m. | #8
On Mar 12 08:41, Yaakov Selkowitz wrote:
> On 2018-03-12 08:35, Corinna Vinschen wrote:
> > On Mar  9 08:46, Thomas Wolff wrote:
> >> diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
> >> index fa6a70d..714b333 100644
> >> --- a/newlib/libc/ctype/Makefile.am
> >> +++ b/newlib/libc/ctype/Makefile.am
> >> @@ -135,3 +135,23 @@ CHEWOUT_FILES= \
> >>  CHAPTERS = ctype.tex
> >>  
> >>  $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
> >> +
> >> +#############################################################################
> >> +# Unicode data
> >> +
> >> +$(srcdir)/%.txt:
> >> +	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
> >
> > This is a no-no.  Do not create links into the OS tree.  You don't
> > even know if the file exists.
> >
> >> +$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
> >> +	cd $(srcdir); sh ./mkcategories
> >> +
> >> +$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
> >> +	cd $(srcdir); sh ./mkcaseconv
> >> +
> >
> > Consequently, these rules are broken.   Consider that somebody might
> > build from git without manually installing UnicodeData.txt.  In that
> > case the above rules lead to a build error, along the lines of
> >
> >   UnicodeData.txt not found
> >   Error 1
> >   ...
> >
> > It would be helpful to create rules which just skip the dependency
> > if UnicodeData.txt doesn't exist in the source tree.  Unfortunately,
> > the only way to do that off the top of my head is to make the *.t files
> > depend on a phony target which then does everything in shell.  Better
> > ideas highly appreciated.
>
> I think either unicode-ucd is a required build dependency, or otherwise
> these rules would have to be explicit only and keep the generated files
> in git for the benefit of those without it.


A build dependency to unicode-ucd should not be required.  Why not have
automatic build rules which are simply skipped if the file doesn't
exist?  So somebody can download the file and the rules do the rest?
It's not *that* important, but a nice feature.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Thomas Wolff March 12, 2018, 7:32 p.m. | #9
On 12.03.2018 at 15:21, Corinna Vinschen wrote:
> On Mar 12 08:41, Yaakov Selkowitz wrote:
>> On 2018-03-12 08:35, Corinna Vinschen wrote:
>>> On Mar  9 08:46, Thomas Wolff wrote:
>>>> diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
>>>> index fa6a70d..714b333 100644
>>>> --- a/newlib/libc/ctype/Makefile.am
>>>> +++ b/newlib/libc/ctype/Makefile.am
>>>> @@ -135,3 +135,23 @@ CHEWOUT_FILES= \
>>>>   CHAPTERS = ctype.tex
>>>>   
>>>>   $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
>>>> +
>>>> +#############################################################################
>>>> +# Unicode data
>>>> +
>>>> +$(srcdir)/%.txt:
>>>> +	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
>>> This is a no-no.  Do not create links into the OS tree.  You don't
>>> even know if the file exists.
>>>
>>>> +$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
>>>> +	cd $(srcdir); sh ./mkcategories
>>>> +
>>>> +$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
>>>> +	cd $(srcdir); sh ./mkcaseconv
>>>> +
>>> Consequently, these rules are broken.   Consider that somebody might
>>> build from git without manually installing UnicodeData.txt.  In that
>>> case the above rules lead to a build error, along the lines of
>>>
>>>    UnicodeData.txt not found
>>>    Error 1
>>>    ...
>>>
>>> It would be helpful to create rules which just skip the dependency
>>> if UnicodeData.txt doesn't exist in the source tree.  Unfortunately,
>>> the only way to do that off the top of my head is to make the *.t files
>>> depend on a phony target which then does everything in shell.  Better
>>> ideas highly appreciated.

That's why I would have preferred to go with explicit generation only,
as you had previously requested.

>> I think either unicode-ucd is a required build dependency, or otherwise
>> these rules would have to be explicit only and keep the generated files
>> in git for the benefit of those without it.

Yes, I forgot to suggest that unicode-ucd should be listed at 
https://cygwin.com/[faq/]faq.html#faq.programming.building-cygwin.
If that's not desired, on the other hand...
> A build dependency to unicode-ucd should not be required.  Why not have
> automatic build rules which are simply skipped if the file doesn't
> exist?  So somebody can download the file and the rules do the rest?
> It's not *that* important, but a nice feature.

The easiest way would be that the mkunidata scripts just ignore the 
missing files, like in the patch attached.
On the other hand, in my original scripts I had the additional fallback 
option to download them from unicode.org on demand.
If that's acceptable in a build process, I could add that back in.
Or just drop those rules and stay explicit?
Thomas
From bac4388880e0348c8dac7fe9f133606af5f6c29d Mon Sep 17 00:00:00 2001
From: Thomas Wolff <towo@towo.net>
Date: Mon, 12 Mar 2018 20:24:22 +0100
Subject: [PATCH] drop strict dependency on Unicode data in unicode-ucd package

---
 newlib/libc/ctype/mkunidata  | 2 +-
 newlib/libc/string/mkunidata | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/newlib/libc/ctype/mkunidata b/newlib/libc/ctype/mkunidata
index ea18e67..05a7043 100755
--- a/newlib/libc/ctype/mkunidata
+++ b/newlib/libc/ctype/mkunidata
@@ -24,7 +24,7 @@ case "$1" in
 esac
 
 for data in UnicodeData.txt
-do	test -r $data || ln -s /usr/share/unicode/ucd/$data . || exit 9
+do	test -r $data || ln -s /usr/share/unicode/ucd/$data . || exit 0
 done
 
 #############################################################################
diff --git a/newlib/libc/string/mkunidata b/newlib/libc/string/mkunidata
index c0bf5de..02735b5 100755
--- a/newlib/libc/string/mkunidata
+++ b/newlib/libc/string/mkunidata
@@ -24,7 +24,7 @@ case "$1" in
 	done
 	;;
 *)	echo checking package unicode-ucd
-	grep unicode-ucd /etc/setup/installed.db || exit 9
+	grep unicode-ucd /etc/setup/installed.db || exit 0
 	;;
 esac
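
One detail of the patched chain in ctype/mkunidata may be worth checking: on typical POSIX systems ln -s creates the link even when its target does not exist, so the trailing "|| exit 0" fires only when ln itself fails, not whenever the data file is missing. The sketch below reproduces the chain in a scratch directory to show that. link_then_check and the paths are hypothetical, and behaviour on systems with unusual symlink handling may differ.

```shell
#!/bin/sh
# link_then_check DIR TARGET
# Runs the same "test -r F || ln -s TARGET F" chain as the script for
# F = DIR/UnicodeData.txt, then reports the outcome:
#   ln-failed - ln itself failed (the only case where "|| exit N" would fire)
#   readable  - the file can be read afterwards
#   dangling  - ln succeeded but the link points at nothing readable
link_then_check() {
    f="$1/UnicodeData.txt"
    test -r "$f" || ln -s "$2" "$f" 2>/dev/null || { echo ln-failed; return 0; }
    if test -r "$f"; then echo readable; else echo dangling; fi
}

work=$(mktemp -d)
# First call: the target does not exist, yet ln -s typically succeeds.
link_then_check "$work" "$work/no-such-dir/UnicodeData.txt"
# Second call: the dangling link is already there, so ln fails this time.
link_then_check "$work" "$work/no-such-dir/UnicodeData.txt"
rm -rf "$work"
```

If the intent is to stop quietly whenever the data is unavailable, a second readability test after the link attempt would cover the dangling case as well.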
Corinna Vinschen March 13, 2018, 10:46 a.m. | #10
On Mar 12 20:32, Thomas Wolff wrote:
> On 12.03.2018 at 15:21, Corinna Vinschen wrote:
> > A build dependency to unicode-ucd should not be required.  Why not have
> > automatic build rules which are simply skipped if the file doesn't
> > exist?  So somebody can download the file and the rules do the rest?
> > It's not *that* important, but a nice feature.
> The easiest way would be that the mkunidata scripts just ignore the missing
> files, like in the patch attached.
> On the other hand, in my original scripts I had the additional fallback
> option to download them from unicode.org on demand.
> If that's acceptable in a build process, I could add that back in.
> Or just drop those rules and stay explicit?


Implicit would be nice, but it's not actually a hard requirement.

We must not rely on system-installed files outside the source tree,
build tree or outside the toolchain.

Please provide a patch removing any ln to files under /usr/share, etc.
The explicit rules add a rule to download the Unicode.txt file into the
src tree.  That's the only file we may rely on.  If it doesn't exist,
an implicit rule should not break the build.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Thomas Wolff March 13, 2018, 1:56 p.m. | #11
On 13.03.2018 11:46, Corinna Vinschen wrote:
> On Mar 12 20:32, Thomas Wolff wrote:
>> On 12.03.2018 at 15:21, Corinna Vinschen wrote:
>>> A build dependency to unicode-ucd should not be required.  Why not have
>>> automatic build rules which are simply skipped if the file doesn't
>>> exist?  So somebody can download the file and the rules do the rest?
>>> It's not *that* important, but a nice feature.
>> The easiest way would be that the mkunidata scripts just ignore the missing
>> files, like in the patch attached.
>> On the other hand, in my original scripts I had the additional fallback
>> option to download them from unicode.org on demand.
>> If that's acceptable in a build process, I could add that back in.
>> Or just drop those rules and stay explicit?
> Implicit would be nice, it's not actually a hard requirement.
>
> We must not rely on system-installed files outside the source tree,
> build tree or outside the toolchain.

That's why my patch from yesterday ignores missing files.

> Please provide a patch removing any ln to files under /usr/share, etc.

Is a dependency on one more Cygwin package a problem, given that the build
already depends on obscure ones like "cocom"?
Is "ln -s" the culprit, so would "cp" be OK?

> The explicit rules add a rule to download the Unicode.txt file into the
> src tree.  That's the only file we may rely on.  If it doesn't exist,
> an implicit rule should not break the build.

Well, the patch I mentioned no longer breaks the build. I can tweak the
script further to your preference, but if a solution involves further
Makefile changes (like "phony" usage), I would ask you to do that.

Regarding the build problem I had earlier: it may have been caused by my
recreating the target dir Makefile locally with "make Makefile", in order
to avoid the two-hour turnaround cycle that I would otherwise have when
building Cygwin.

Unless we agree otherwise, just drop the "implicit" rules patch.
Thanks
Thomas
Corinna Vinschen March 13, 2018, 3:43 p.m. | #12
On Mar 13 14:56, Thomas Wolff wrote:
> On 13.03.2018 11:46, Corinna Vinschen wrote:
> > On Mar 12 20:32, Thomas Wolff wrote:
> > > On 12.03.2018 at 15:21, Corinna Vinschen wrote:
> > > > A build dependency to unicode-ucd should not be required.  Why not have
> > > > automatic build rules which are simply skipped if the file doesn't
> > > > exist?  So somebody can download the file and the rules do the rest?
> > > > It's not *that* important, but a nice feature.
> > > The easiest way would be that the mkunidata scripts just ignore the missing
> > > files, like in the patch attached.
> > > On the other hand, in my original scripts I had the additional fallback
> > > option to download them from unicode.org on demand.
> > > If that's acceptable in a build process, I could add that back in.
> > > Or just drop those rules and stay explicit?
> > Implicit would be nice, it's not actually a hard requirement.
> >
> > We must not rely on system-installed files outside the source tree,
> > build tree or outside the toolchain.
> That's why my patch yesterday would ignore missing of the files.
>
> > Please provide a patch removing any ln to files under /usr/share, etc.
> Is dependency to one more cygwin package (like obscure ones like "cocom"
> already) a problem?
> Is "ln -s" the culprit, so would "cp" be OK?


The cocom requirement is in Cygwin, not in newlib.  Please keep in mind
that newlib is not dependent on Cygwin but vice versa.  The newlib build
shouldn't make assumptions we can get away with for Cygwin.

You can also never make assumptions about the age of a file provided by
your build OS.  There's a good chance the OS is providing an older
Unicode data file version than the one you actually want to build the
dependent files from.  Worst case, older than the currently supported
Unicode version.

The bottom line is, the Unicode data file has been either downloaded into
the expected location in libc/ctype, or it's not available.

> > The explicit rules add a rule to download the Unicode.txt file into the
> > src tree.  That's the only file we may rely on.  If it doesn't exist,
> > an implicit rule should not break the build.
> Well, my mentioned patch wouldn't break it anymore. I can tweak the script
> further to your preference, but if a solution would involve further Makefile
> changes (like "phony" usage), I would ask you to do that.


I have enough other stuff on my plate, so, no.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Thomas Wolff March 13, 2018, 6:24 p.m. | #13
On 13.03.2018 at 16:43, Corinna Vinschen wrote:
> On Mar 13 14:56, Thomas Wolff wrote:
>> On 13.03.2018 11:46, Corinna Vinschen wrote:
>>> On Mar 12 20:32, Thomas Wolff wrote:
>>>> On 12.03.2018 at 15:21, Corinna Vinschen wrote:
>>>>> A build dependency to unicode-ucd should not be required.  Why not have
>>>>> automatic build rules which are simply skipped if the file doesn't
>>>>> exist?  So somebody can download the file and the rules do the rest?
>>>>> It's not *that* important, but a nice feature.
>>>> The easiest way would be that the mkunidata scripts just ignore the missing
>>>> files, like in the patch attached.
>>>> On the other hand, in my original scripts I had the additional fallback
>>>> option to download them from unicode.org on demand.
>>>> If that's acceptable in a build process, I could add that back in.
>>>> Or just drop those rules and stay explicit?
>>> Implicit would be nice, it's not actually a hard requirement.
>>>
>>> We must not rely on system-installed files outside the source tree,
>>> build tree or outside the toolchain.
>> That's why my patch yesterday would ignore missing of the files.
>>> Please provide a patch removing any ln to files under /usr/share, etc.
>> ...
> ...
> You can also never make assumptions about the age of a file provided by
> your build OS.  There's a good chance the OS is providing an older
> Unicode data file version than the one you actually want to build the
> dependent files from.  Worst case, older than the currently supported
> Unicode version.
>
> The bottom line is, the Unicode data file has been either downloaded into
> the expected location in libc/ctype, or it's not available.
>
>>> The explicit rules add a rule to download the Unicode.txt file into the
>>> src tree.  That's the only file we may rely on.  If it doesn't exist,
>>> an implicit rule should not break the build.
>> Well, my mentioned patch wouldn't break it anymore.

Actually it was broken and would still abort if unicode-ucd isn't installed.
And the contributed version of string/mkunidata is broken, too; it does
nothing because it contains an "exit" as a test artefact.
Sorry for this embarrassing incident.

>> I can tweak the script further to your preference, ...

The attached patch should be sufficient for the various requirements. See
the extended commit description.
Thomas
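
The skipping behaviour described in this message could look roughly like the
following sketch. have_data is a hypothetical helper name used only for
illustration; the scripts in the patch inline an equivalent loop. The idea is
that a missing input is reported but ends the run with a zero exit status, so
a Makefile that calls the script is not interrupted:

```shell
#!/bin/sh

# Hypothetical helper: return 0 if every named file is readable in the
# current directory; otherwise name the first missing one and return 1.
have_data () {
	for data in "$@"
	do	if [ ! -r "$data" ]
		then	echo "$data not available, skipping table generation"
			return 1
		fi
	done
}

# In a generator script, a missing input would end the run successfully:
#   have_data UnicodeData.txt Blocks.txt EastAsianWidth.txt || exit 0
```

Only files in the script's own directory are consulted, which matches the
requirement not to rely on system-installed copies.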
From 4f7df495bad81fffa858c4f27c5fbd709e40b7b3 Mon Sep 17 00:00:00 2001
From: Thomas Wolff <mintty@users.noreply.github.com>
Date: Tue, 13 Mar 2018 18:26:19 +0100
Subject: [PATCH] fix/enhance Unicode table generation scripts

Scripts do not try to acquire Unicode data by best-effort magic anymore.
Options supported:
-h for help
-i to copy Unicode data from /usr/share/unicode/ucd first
-u to download Unicode data from unicode.org first
If the necessary Unicode files are still not available locally (even after
-i or -u, if given), table generation is skipped without returning an error
code, so as not to obstruct the build process when called from a Makefile.
---
 newlib/libc/ctype/mkunidata  | 34 ++++++++++++++++++++++++++--------
 newlib/libc/string/mkunidata | 37 +++++++++++++++++++++++++++----------
 2 files changed, 53 insertions(+), 18 deletions(-)

diff --git a/newlib/libc/ctype/mkunidata b/newlib/libc/ctype/mkunidata
index ea18e67..4bdf3bc 100755
--- a/newlib/libc/ctype/mkunidata
+++ b/newlib/libc/ctype/mkunidata
@@ -1,6 +1,6 @@
 #! /bin/sh
 
-echo generating Unicode character properties data for newlib/libc/ctype
+echo Generating Unicode character properties data for newlib/libc/ctype
 
 cd `dirname $0`
 
@@ -8,23 +8,41 @@ cd `dirname $0`
 # checks and (with option -u) download
 
 case "$1" in
+-h)	echo "Usage: $0 [-h|-u|-i]"
+	echo "Generate case conversion table caseconv.t and character category table categories.t"
+	echo "from local Unicode file UnicodeData.txt."
+	echo ""
+	echo "Options:"
+	echo "  -u    download file from unicode.org first"
+	echo "  -i    copy file from /usr/share/unicode/ucd first"
+	echo "  -h    show this"
+	exit
+	;;
 -u)
-	#WGET=wget -N -t 1 --timeout=55
-	WGET=curl -R -O --connect-timeout 55
-	WGET+=-z $@
+	wget () {
+		curl -R -O --connect-timeout 55 -z "`basename $1`" "$1"
+	}
 
 	echo downloading data from unicode.org
 	for data in UnicodeData.txt
-	do	$WGET http://unicode.org/Public/UNIDATA/$data
+	do	wget http://unicode.org/Public/UNIDATA/$data
 	done
 	;;
-*)	echo checking package unicode-ucd
-	grep unicode-ucd /etc/setup/installed.db || exit 9
+-i)
+	echo copying data from /usr/share/unicode/ucd
+	for data in UnicodeData.txt
+	do	cp /usr/share/unicode/ucd/$data .
+	done
 	;;
 esac
 
+echo checking Unicode data file
 for data in UnicodeData.txt
-do	test -r $data || ln -s /usr/share/unicode/ucd/$data . || exit 9
+do	if [ -r $data ]
+	then	true
+	else	echo $data not available, skipping table generation
+		exit
+	fi
 done
 
 #############################################################################
diff --git a/newlib/libc/string/mkunidata b/newlib/libc/string/mkunidata
index c0bf5de..7ebebeb 100755
--- a/newlib/libc/string/mkunidata
+++ b/newlib/libc/string/mkunidata
@@ -1,6 +1,6 @@
 #! /bin/sh
 
-echo generating Unicode width data for newlib/libc/string/wcwidth.c
+echo Generating Unicode width data for newlib/libc/string/wcwidth.c
 
 cd `dirname $0`
 PATH="$PATH":.	# ensure access to uniset tool
@@ -9,34 +9,51 @@ PATH="$PATH":.	# ensure access to uniset tool
 # checks and (with option -u) downloads
 
 case "$1" in
+-h)	echo "Usage: $0 [-h|-u|-i]"
+	echo "Generate width data tables ambiguous.t, combining.t, wide.t"
+	echo "from local Unicode files UnicodeData.txt, Blocks.txt, EastAsianWidth.txt."
+	echo ""
+	echo "Options:"
+	echo "  -u    download files from unicode.org first, download uniset tool"
+	echo "  -i    copy files from /usr/share/unicode/ucd first"
+	echo "  -h    show this"
+	exit
+	;;
 -u)
-	#WGET=wget -N -t 1 --timeout=55
-	WGET=curl -R -O --connect-timeout 55
-	WGET+=-z $@
+	wget () {
+		curl -R -O --connect-timeout 55 -z "`basename $1`" "$1"
+	}
 
 	echo downloading uniset tool
-	$WGET http://www.cl.cam.ac.uk/~mgk25/download/uniset.tar.gz
+	wget http://www.cl.cam.ac.uk/~mgk25/download/uniset.tar.gz
 	gzip -dc uniset.tar.gz | tar xvf - uniset
 
 	echo downloading data from unicode.org
 	for data in UnicodeData.txt Blocks.txt EastAsianWidth.txt
-	do	$WGET http://unicode.org/Public/UNIDATA/$data
+	do	wget http://unicode.org/Public/UNIDATA/$data
 	done
 	;;
-*)	echo checking package unicode-ucd
-	grep unicode-ucd /etc/setup/installed.db || exit 9
+-i)
+	echo copying data from /usr/share/unicode/ucd
+	for data in UnicodeData.txt Blocks.txt EastAsianWidth.txt
+	do	cp /usr/share/unicode/ucd/$data .
+	done
 	;;
 esac
 
 echo checking uniset tool
 type uniset || exit 9
 
+echo checking Unicode data files
 for data in UnicodeData.txt Blocks.txt EastAsianWidth.txt
-do	test -r $data || ln -s /usr/share/unicode/ucd/$data . || exit 9
+do	if [ -r $data ]
+	then	true
+	else	echo $data not available, skipping table generation
+		exit
+	fi
 done
 
 echo generating from Unicode version `sed -e 's,[^.0-9],,g' -e 1q Blocks.txt`
-exit
 
 #############################################################################
 # table generation
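
The -u branch of both scripts wraps curl so that a file is only transferred
when the remote copy is newer than the local one (-z with a file name uses
that file's modification time, -R keeps the remote timestamp, -O keeps the
remote file name). A minimal sketch of the same idea follows. The function
name fetch_if_newer and the CURL override variable are hypothetical additions
for illustration and dry runs; unlike the wrapper in the patch, this variant
omits -z when no local copy exists yet:

```shell
#!/bin/sh

# Fetch URL into the current directory, but only if the remote copy is
# newer than a local file of the same name.
# CURL is a hypothetical override hook, e.g. CURL=echo for a dry run.
fetch_if_newer () {
	url=$1
	name=$(basename "$url")
	if [ -e "$name" ]
	then	${CURL:-curl} -R -O --connect-timeout 55 -z "$name" "$url"
	else	${CURL:-curl} -R -O --connect-timeout 55 "$url"
	fi
}
```

With CURL=echo the function only prints the arguments it would pass, which
makes it possible to check the command line without network access.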
Corinna Vinschen March 14, 2018, 9:47 a.m. | #14
On Mar 13 19:24, Thomas Wolff wrote:
> On 13.03.2018 at 16:43, Corinna Vinschen wrote:
> > You can also never make assumptions about the age of a file provided by
> > your build OS.  There's a good chance the OS is providing an older
> > Unicode data file version than the one you actually want to build the
> > dependent files from.  Worst case, older than the currently supported
> > Unicode version.
> > 
> > The bottom line is, the Unicode data file has been either downloaded into
> > the expected location in libc/ctype, or it's not available.
> > 
> > > > The explicit rules add a rule to download the Unicode.txt file into the
> > > > src tree.  That's the only file we may rely on.  If it doesn't exist,
> > > > an implicit rule should not break the build.
> > > Well, my mentioned patch wouldn't break it anymore.
> Actually it was broken and would still abort if unicode-ucd isn't installed.
> And the contributed version of string/mkunidata is broken, too; it does
> nothing because it contains an "exit" as a test artefact.
> Sorry for this embarrassing incident.


Ok, that's fine, I pushed the patch.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat

Patch

diff --git a/newlib/libc/ctype/Makefile.am b/newlib/libc/ctype/Makefile.am
index fa6a70d..714b333 100644
--- a/newlib/libc/ctype/Makefile.am
+++ b/newlib/libc/ctype/Makefile.am
@@ -135,3 +135,23 @@  CHEWOUT_FILES= \
 CHAPTERS = ctype.tex
 
 $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# case conversion and character category data for libc/ctype/??w*.c
+
+categories.c:	$(srcdir)/categories.t
+
+towctrans_l.c:	$(srcdir)/caseconv.t
+
+$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcategories
+
+$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcaseconv
+
diff --git a/newlib/libc/ctype/Makefile.in b/newlib/libc/ctype/Makefile.in
index 9932a94..ffcb384 100644
--- a/newlib/libc/ctype/Makefile.in
+++ b/newlib/libc/ctype/Makefile.in
@@ -1158,3 +1158,23 @@  $(lpfx)ctype_.$(oext): ctype_.c ctype_iso.h ctype_cp.h
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# case conversion and character category data for libc/ctype/??w*.c
+
+categories.c:	$(srcdir)/categories.t
+
+towctrans_l.c:	$(srcdir)/caseconv.t
+
+$(srcdir)/categories.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcategories
+
+$(srcdir)/caseconv.t:	$(srcdir)/UnicodeData.txt
+	cd $(srcdir); sh ./mkcaseconv
+
diff --git a/newlib/libc/string/Makefile.am b/newlib/libc/string/Makefile.am
index 49de080..2617112 100644
--- a/newlib/libc/string/Makefile.am
+++ b/newlib/libc/string/Makefile.am
@@ -168,3 +168,27 @@  wcscasecmp_l.def wcscoll_l.def	wcsncasecmp_l.def wcsxfrm_l.def \
 strverscmp.def	strnstr.def	wmempcpy.def
 
 CHAPTERS = strings.tex wcstrings.tex
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# width data for libc/string/wcwidth.c
+
+wcwidth.c:	$(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
+
+$(srcdir)/combining.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B +D7B0-D7C6 +D7CB-D7FB c > combining.t
+
+$(srcdir)/WIDTH-A:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwidthA
+
+$(srcdir)/ambiguous.t:	$(srcdir)/WIDTH-A $(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +WIDTH-A -cat=Me -cat=Mn -cat=Cf c > ambiguous.t
+
+$(srcdir)/wide.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwide
+
diff --git a/newlib/libc/string/Makefile.in b/newlib/libc/string/Makefile.in
index eb8fafc..3758f71 100644
--- a/newlib/libc/string/Makefile.in
+++ b/newlib/libc/string/Makefile.in
@@ -1416,3 +1416,27 @@  docbook: $(DOCBOOK_OUT_FILES)
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
+
+#############################################################################
+# Unicode data
+
+$(srcdir)/%.txt:
+	cd $(srcdir); test -r $(notdir $@) || ln -s /usr/share/unicode/ucd/$(notdir $@) .
+
+#############################################################################
+# width data for libc/string/wcwidth.c
+
+wcwidth.c:	$(srcdir)/ambiguous.t $(srcdir)/combining.t $(srcdir)/wide.t
+
+$(srcdir)/combining.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B +D7B0-D7C6 +D7CB-D7FB c > combining.t
+
+$(srcdir)/WIDTH-A:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwidthA
+
+$(srcdir)/ambiguous.t:	$(srcdir)/WIDTH-A $(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt
+	cd $(srcdir); ./uniset +WIDTH-A -cat=Me -cat=Mn -cat=Cf c > ambiguous.t
+
+$(srcdir)/wide.t:	$(srcdir)/UnicodeData.txt $(srcdir)/Blocks.txt $(srcdir)/EastAsianWidth.txt
+	cd $(srcdir); sh ./mkwide
+