nano malloc allocator algorithm improvement

Message ID a742986b-4c4d-6d59-6433-938bc471ae7a@thingsconnected.nl
State New
Headers show
Series
  • nano malloc allocator algorithm improvement
Related show

Commit Message

The current nano malloc implementation has two issues reducing the amount of memory available and increasing fragmentation.

The first issue is that sbrk() will be called to allocate a space with the size of the entire requested alloc_size, even if the last free chunk borders the edge of currently allocated memory. This means that in a system with 20 kb of RAM, you will get ENOMEM when performing this:

tmpbuf = malloc(10*1024);
free(tmpbuf);
tmpbuf2 = malloc(11*1024); // will fail on a 20 kb RAM system

It would be better if malloc would detect if the last free chunk can be simply grown to alloc_size instead. (Or if free() would return the memory to the system by sbrk(-10kb), which is currently impossible.)

The second issue is that a free chunk that is oversized will be cut up in two pieces, where the *second* piece is used for allocation and the first one is kept as a free chunk. Although this is easier/shorter in code (because the free list remains unaffected apart from the size of the free chunk) it leads to an inefficient pattern of memory allocation, and ultimately in fragmentation and slower malloc/free calls.

As an example, again in a system with 20kb, if the above issue would be patched, you will still get ENOMEN when performing this:

tmpbuf = malloc(10*1024);
free(tmpbuf);
tmpbuf2 = malloc(1); // this will end up in the middle of RAM space
tmpbuf3 = malloc(15*1024); // will fail on a 20 kb RAM system

Total impact on binary size (on ARM Thumb) is ~100 bytes.
---
  newlib/libc/stdlib/nano-mallocr.c | 41 ++++++++++++++++++++++++++++---
  1 file changed, 38 insertions(+), 3 deletions(-)

-- 
2.23.0

Comments

Torbjorn SVENSSON via Newlib Aug. 22, 2020, 10:57 p.m. | #1
Maarten van der Schrieck | Things Connected <maarten@thingsconnected.nl>
writes:

> The current nano malloc implementation has two issues reducing the amount of memory available and increasing fragmentation.

>

> The first issue is that sbrk() will be called to allocate a space with

> the size of the entire requested alloc_size, even if the last free

> chunk borders the edge of currently allocated memory. This means that

> in a system with 20 kb of RAM, you will get ENOMEM when performing

> this:


Oh, that's a great idea. I did the same for realloc when the block was
at the end of the heap; doing the same for malloc is a nice
addition.

> The second issue is that a free chunk that is oversized will be cut up

> in two pieces, where the *second* piece is used for allocation and the

> first one is kept as a free chunk. Although this is easier/shorter in

> code (because the free list remains unaffected apart from the size of

> the free chunk) it leads to an inefficient pattern of memory

> allocation, and ultimately in fragmentation and slower malloc/free

> calls.


I re-worked the list management to use a different pattern that makes
this change much less invasive.

https://github.com/picolibc/picolibc/commit/fd2d18bb5ab442f16789c243648d07b4ec8e2b29

-- 
-keith
Torbjorn SVENSSON via Newlib Aug. 23, 2020, 12:12 a.m. | #2
Keith Packard via Newlib <newlib@sourceware.org> writes:

> Maarten van der Schrieck | Things Connected <maarten@thingsconnected.nl>

> writes:

>

>> The current nano malloc implementation has two issues reducing the amount of memory available and increasing fragmentation.

>>

>> The first issue is that sbrk() will be called to allocate a space with

>> the size of the entire requested alloc_size, even if the last free

>> chunk borders the edge of currently allocated memory. This means that

>> in a system with 20 kb of RAM, you will get ENOMEM when performing

>> this:

>

> Oh, that's a great idea. I did the same for realloc when the block was

> at the end of the heap; doing the same for malloc is a nice

> addition.


I've gone ahead and added this. Because this code is shared with
realloc, the overall impact on the code size is pretty modest. I think
it's easily worth the increase in code size because it will use ram more
efficiently now.

I can back-port this code to newlib if people are interested; certainly
having people on this list review the could would be great.

https://github.com/picolibc/picolibc/blob/main/newlib/libc/stdlib/nano-mallocr.c

-- 
-keith
Torbjorn SVENSSON via Newlib Aug. 24, 2020, 10:03 a.m. | #3
On Aug 22 17:12, Keith Packard via Newlib wrote:
> Keith Packard via Newlib <newlib@sourceware.org> writes:

> 

> > Maarten van der Schrieck | Things Connected <maarten@thingsconnected.nl>

> > writes:

> >

> >> The current nano malloc implementation has two issues reducing the amount of memory available and increasing fragmentation.

> >>

> >> The first issue is that sbrk() will be called to allocate a space with

> >> the size of the entire requested alloc_size, even if the last free

> >> chunk borders the edge of currently allocated memory. This means that

> >> in a system with 20 kb of RAM, you will get ENOMEM when performing

> >> this:

> >

> > Oh, that's a great idea. I did the same for realloc when the block was

> > at the end of the heap; doing the same for malloc is a nice

> > addition.

> 

> I've gone ahead and added this. Because this code is shared with

> realloc, the overall impact on the code size is pretty modest. I think

> it's easily worth the increase in code size because it will use ram more

> efficiently now.

> 

> I can back-port this code to newlib if people are interested; certainly

> having people on this list review the could would be great.


The review should take place on the newlib list ;)


Thanks,
Corinna

Patch

diff --git a/newlib/libc/stdlib/nano-mallocr.c b/newlib/libc/stdlib/nano-mallocr.c
index 6ba0eb74b..9d3462861 100644
--- a/newlib/libc/stdlib/nano-mallocr.c
+++ b/newlib/libc/stdlib/nano-mallocr.c
@@ -258,15 +258,50 @@  void * nano_malloc(RARG malloc_size_t s)
      while (r)
      {
          int rem = r->size - alloc_size;
+
+        /* To prevent a trailing but small chunk to be left unused, grow
+         * the available memory and the chunk instead of allocating
+         * alloc_size */
+        if (rem < 0 && r->next == NULL)
+        {
+            /* This is the last chunk, check if there is any allocated
+             * memory after it. */
+            chunk *heap_end = sbrk_aligned(RCALL 0);
+            if ((char *)r + r->size == heap_end)
+            {
+                /* Attempt to allocate the additional memory needed and
+                 * go on as usual */
+                if (sbrk_aligned(RCALL -rem) == (void*)-1)
+                {
+                    RERRNO = ENOMEM;
+                    MALLOC_UNLOCK;
+                    return NULL;
+                }
+                rem = 0;
+                r->size = alloc_size;
+                /* Fall through to regular logic below */
+            }
+        }
          if (rem >= 0)
          {
              if (rem >= MALLOC_MINCHUNK)
              {
                  /* Find a chunk that much larger than required size, break
-                * it into two chunks and return the second one */
-                r->size = rem;
-                r = (chunk *)((char *)r + rem);
+                 * it into two chunks and return the first one.
+                 * Returning the second part would be a bit easier, because
+                 * the chunk list stays intact, but doing that leads to
+                 * memory fragmentation quickly.
+                 */
+                chunk *part2 = (chunk *)((char *)r + alloc_size);
+                part2->size = rem;
+                part2->next = r->next;
                  r->size = alloc_size;
+                if ( p == r )
+                {
+                    free_list = part2;
+                } else {
+                    p->next = part2;
+                }
              }
              /* Find a chunk that is exactly the size or slightly bigger
               * than requested size, just return this chunk */