[1/4] arm: Fix armv7 neon memchr on ARM mode

Message ID 1523481378-16290-1-git-send-email-adhemerval.zanella@linaro.org
State New

Commit Message

Adhemerval Zanella April 11, 2018, 9:16 p.m.
The current optimized armv7 NEON memchr wrongly uses NO_THUMB to
conditionalize Thumb instruction usage.  The flag is meant to be
defined before the sysdep.h inclusion to indicate that the assembly
must be built in ARM mode, not to check whether Thumb is enabled.
This patch fixes it by using the GCC-provided '__thumb__' macro
instead.
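
As a rough illustration of the distinction above, the following C sketch
tests the compiler-provided '__thumb__' macro rather than a
project-defined flag.  The helper name 'built_for_thumb' is hypothetical
and is not part of the patch; it only shows the preprocessor check.

```c
/* Hypothetical helper, not part of the patch: reports whether this
   translation unit is being compiled for the Thumb instruction set.  */
static int
built_for_thumb (void)
{
#ifdef __thumb__
  /* GCC predefines __thumb__ when targeting Thumb (e.g. -mthumb).  */
  return 1;
#else
  /* ARM mode (e.g. -marm), or a non-ARM target.  */
  return 0;
#endif
}
```

Because '__thumb__' comes from the compiler, the result follows the
-marm/-mthumb selection without any macro having to be defined before
an include.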

Also, even with the implementation fixed to not use Thumb instructions,
it was clearly not properly tested in ARM mode: the carry flag set by
the loop's 'subs cntin, cntin, #32' is overwritten by the preceding
'cmp synd, #0', so the 'cmp cntin, #0' / 'bhi .Ltail' pair cannot
branch correctly if the loop finishes with 'cntin' negative
(indicating that some bytes still need to be checked).  This patch
also fixes it by branching directly on the flags left by the last loop
iteration (so in ARM mode both '.Lmasklast' and '.Ltail' run even if
no byte is found in the last loop iteration).
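
The flag behaviour described above can be sketched with a small C model.
This is only one reading of the failure, under simplifying assumptions:
it models just the C and Z flags that 'bhi' reads and only the final
branch after the loop.  All names ('struct flags', 'flags_of_sub',
'bhi_taken' and the two '*_branches_to_tail' helpers) are hypothetical.

```c
#include <stdint.h>

/* Minimal model of the two ARM condition flags that 'bhi' reads.  */
struct flags { int c; int z; };

/* Flags produced by 'subs rd, a, b' or 'cmp a, b': C is set when no
   borrow occurs (a >= b, unsigned), Z when the result is zero.  */
static struct flags
flags_of_sub (uint32_t a, uint32_t b)
{
  struct flags f = { a >= b, a == b };
  return f;
}

/* 'bhi' is taken when C is set and Z is clear (unsigned higher).  */
static int
bhi_taken (struct flags f)
{
  return f.c && !f.z;
}

/* Old ARM-mode sequence: 'cmp cntin, #0' then 'bhi .Ltail', where
   cntin already holds the result of 'subs cntin, cntin, #32'.  */
static int
old_arm_branches_to_tail (uint32_t cntin_before)
{
  uint32_t cntin = cntin_before - 32;	/* May wrap to "negative".  */
  return bhi_taken (flags_of_sub (cntin, 0));
}

/* Patched sequence: 'bhi .Ltail' reads the flags of the 'subs'.  */
static int
new_branches_to_tail (uint32_t cntin_before)
{
  return bhi_taken (flags_of_sub (cntin_before, 32));
}
```

In this model, with 16 bytes left before the final 'subs' the old
sequence still takes the branch to '.Ltail', because a negative 'cntin'
compares as a large unsigned value, while the patched sequence falls
through to '.Lmasklast'.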

Checked on arm-linux-gnueabihf (with -marm and -mthumb mode).

	[BZ #23031]
	* sysdeps/arm/armv7/multiarch/memchr_neon.S (memchr): Fix tail check
	on ARM mode.
	(NO_THUMB): Check __thumb__ instead.
 ChangeLog                                 | 7 +++++++
 sysdeps/arm/armv7/multiarch/memchr_neon.S | 9 +++------
 2 files changed, 10 insertions(+), 6 deletions(-)



diff --git a/sysdeps/arm/armv7/multiarch/memchr_neon.S b/sysdeps/arm/armv7/multiarch/memchr_neon.S
index 1b2ae75..1b2a69d 100644
--- a/sysdeps/arm/armv7/multiarch/memchr_neon.S
+++ b/sysdeps/arm/armv7/multiarch/memchr_neon.S
@@ -68,7 +68,7 @@ 
  * allows to identify exactly which byte has matched.
-#ifndef NO_THUMB
+#ifdef __thumb__
@@ -132,7 +132,7 @@  ENTRY(memchr)
 	/* The first block can also be the last */
 	bls		.Lmasklast
 	/* Have we found something already? */
-#ifndef NO_THUMB
+#ifdef __thumb__
 	cbnz		synd, .Ltail
 #else
 	cmp		synd, #0
@@ -176,14 +176,11 @@  ENTRY(memchr)
 	vpadd.i8	vdata0_0, vdata0_0, vdata1_0
 	vpadd.i8	vdata0_0, vdata0_0, vdata0_0
 	vmov		synd, vdata0_0[0]
-#ifndef NO_THUMB
+#ifdef __thumb__
 	cbz		synd, .Lnotfound
 	bhi		.Ltail	/* Uses the condition code from
 				   subs cntin, cntin, #32 above.  */
 #else
-	cmp		synd, #0
-	beq		.Lnotfound
-	cmp		cntin, #0
 	bhi		.Ltail
 #endif