[RFC] RISC-V: PR27916, Support mapping symbols.

Message ID 20210529021818.30276-1-nelson.chu@sifive.com
State New

Commit Message

Nelson Chu May 29, 2021, 2:18 a.m.
Similar to ARM/AARCH64, we add mapping symbols to the symbol table
to mark the start addresses of data and instructions.  The $d means data,
and the $x means instruction.  The disassembler then uses these symbols
to decide whether to dump data or instructions.  Most of the
implementation is ported from ARM/AARCH64, but there are some differences
and improvements, as follows,

* We store all mapping symbols of the fragment, rather than only the first
and last ones.  This helps riscv_check_mapping_symbols clean up the
redundant mapping symbols added by riscv_add_odd_padding_symbol.
Consider the case,

$cat tmp.s
.option norelax
.option rvc
.byte 1
.align 2
nop
$riscv64-unknown-elf-as tmp.s -o tmp.o
$riscv64-unknown-elf-readelf -Ws tmp.o
4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $d
5: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT    1 $x
$riscv64-unknown-elf-objdump -d tmp.o
Disassembly of section .text:

0000000000000000 <.text>:
   0:   0000            .short  0x0000
   2:   0001                    nop
   4:   0001                    nop
   6:   0001                    nop

But AARCH64 adds two consecutive $d, which seems redundant.
Besides, it seems to treat the alignment as DATA, but riscv
usually fills the alignment space with nops, so I think we can treat
that space as instructions.

$cat tmp2.s
.byte 1
.align 2
nop
$aarch64-linux-gnu-as tmp2.s -o tmp2.o
$aarch64-linux-gnu-readelf -Ws tmp2.o
4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $d
5: 0000000000000001     0 NOTYPE  LOCAL  DEFAULT    1 $d
6: 0000000000000004     0 NOTYPE  LOCAL  DEFAULT    1 $x
$aarch64-linux-gnu-objdump -d tmp2.o
Disassembly of section .text:

0000000000000000 <.text>:
   0:   00              .byte   0x00
   1:   00              .byte   0x00
   2:   0000            .short  0x0000
   4:   d503201f        nop

* Unlike AARCH64, which uses frag_var to get the space for alignment,
riscv_frag_align_code uses frag_more.  The riscv_init_frag is called by
frag_var, and is used to init the alignment mapping state for a new
fragment.  Therefore, riscv should either use frag_var as well, or add
the mapping symbol manually in riscv_frag_align_code.  I chose the
latter, since it changes less and seems safer.
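
The cleanup of redundant mapping symbols described in the first item can be sketched as below.  This is only an illustrative model, not the code in the patch: the names map_state, map_sym and drop_redundant_map_syms are hypothetical, and a plain array stands in for the per-fragment lists.  It keeps a symbol only when it changes the state that is already in effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mapping states; these are not the
   real binutils definitions.  */
enum map_state { MAP_NONE, MAP_DATA, MAP_INSN };

struct map_sym
{
  unsigned long offset;   /* Section offset the symbol marks.  */
  enum map_state state;   /* MAP_DATA for $d, MAP_INSN for $x.  */
};

/* Compact SYMS (sorted by offset) in place, dropping every symbol whose
   state repeats the state already in effect.  Return the new count.  */
static size_t
drop_redundant_map_syms (struct map_sym *syms, size_t count)
{
  enum map_state state = MAP_NONE;
  size_t kept = 0;

  for (size_t i = 0; i < count; i++)
    {
      /* The symbols are expected in offset order.  */
      assert (i == 0 || syms[i - 1].offset <= syms[i].offset);

      if (syms[i].state == state)
        continue;
      state = syms[i].state;
      syms[kept++] = syms[i];
    }
  return kept;
}
```

For the tmp.s case above, feeding in $d at 0, $d at 1 (from the odd padding) and $x at 2 leaves $d at 0 and $x at 2, which matches the readelf output.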

TODOS,

* Maybe we can add mapping symbols that carry the data size, to show more
details about the data.  For example, $d4 to show a word, and $d1 to show
a byte.  Besides, for the case,
.byte 1
.word 0

For now we only add $d at the start of .byte.  But if we mark the mapping
symbols as follows,
$d1
$d4
then the disassembly can be closer to the user's original code.

* We need to add a new directive, .inst 0x00000013, to show that the value
is encoded as an instruction rather than as data.

* For the RVC to NORVC and DATA to INSN transitions, the assembler probably
needs to add the alignment automatically.  But in fact the things that
need to be dealt with are more complicated than imagined.

* Consider the testcase mapping-norelax-04b.s: there is a c.nop which is
used for the alignment, but the rvc extension is disabled.  This may
look weird and confuse users, so maybe we should treat these alignments
as MAP_DATA rather than MAP_INSN?  Or maybe as a new MAP_ALIGNMENT?
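
The sized mapping symbols from the first TODO item could be recognized along these lines.  This is a sketch of an idea that the patch does not implement: the "$d<N>" naming and the helper map_sym_data_size are assumptions made for illustration only.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return the data size encoded in a hypothetical "$d<N>" mapping symbol
   name, 0 for a plain "$d" (size unknown), or -1 if NAME is not a data
   mapping symbol at all.  */
static long
map_sym_data_size (const char *name)
{
  char *end;
  long size;

  if (strncmp (name, "$d", 2) != 0)
    return -1;
  if (name[2] == '\0')
    return 0;

  /* Only accept a plain decimal suffix; strtol alone would also take
     leading spaces or a sign.  */
  if (!isdigit ((unsigned char) name[2]))
    return -1;

  size = strtol (name + 2, &end, 10);
  if (*end != '\0' || size <= 0)
    return -1;
  return size;
}
```

With such a helper, a disassembler could print "$d1" regions as .byte and "$d4" regions as .word, and fall back to its own guess for a plain "$d".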

bfd/
    pr 27916
    * cpu-riscv.c (riscv_elf_is_mapping_symbols): Define mapping symbols.
    * cpu-riscv.h: extern riscv_elf_is_mapping_symbols.
    * elfnn-riscv.c (riscv_maybe_function_sym): Do not choose mapping
    symbols as a function name.
    (riscv_elf_is_target_special_symbol): Add mapping symbols.
gas/
    pr 27916
    * config/tc-riscv.c (make_mapping_symbol): Create a new mapping symbol.
    (riscv_mapping_state): Decide whether to create mapping symbol for
    frag_now.
    (riscv_add_odd_padding_symbol): Add the odd bytes of paddings for
    riscv_handle_align.
    (riscv_check_mapping_symbols): Remove all overlapped and redundant
    mapping symbols.
    (md_assemble): Marked as instruction.
    (riscv_frag_align_code): Marked as instruction.
    (riscv_handle_align): Add mapping symbols for odd padding.
    (riscv_init_frag): Add mapping symbols for frag, it usually called
    by frag_var.
    (s_riscv_insn): Marked as instruction.
    (riscv_adjust_symtab): Call riscv_check_mapping_symbols for each section.
    * config/tc-riscv.h (md_cons_align): Defined to riscv_mapping_state
    with MAP_DATA.
    (TC_SEGMENT_INFO_TYPE): Record mapping state for each segment.
    (riscv_frag_mapping_symbol, TC_FRAG_TYPE): Record all mapping symbols
    information for a fragment.
    * testsuite/gas/riscv/mapping-01.s: New testcase.
    * testsuite/gas/riscv/mapping-01a.d: Likewise.
    * testsuite/gas/riscv/mapping-01b.d: Likewise.
    * testsuite/gas/riscv/mapping-02.s: Likewise.
    * testsuite/gas/riscv/mapping-02a.d: Likewise.
    * testsuite/gas/riscv/mapping-02b.d: Likewise.
    * testsuite/gas/riscv/mapping-03.s: Likewise.
    * testsuite/gas/riscv/mapping-03a.d: Likewise.
    * testsuite/gas/riscv/mapping-03b.d: Likewise.
    * testsuite/gas/riscv/mapping-04.s: Likewise.
    * testsuite/gas/riscv/mapping-04a.d: Likewise.
    * testsuite/gas/riscv/mapping-04b.d: Likewise.
    * testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.
    * testsuite/gas/riscv/mapping-norelax-04b.d: Likewise.
    * testsuite/gas/riscv/no-relax-align-2.d: Updated.
include/
    pr 27916
    * opcode/riscv.h (enum riscv_seg_mstate): Added.
opcodes/
    pr 27916
    * riscv-dis.c (last_map_symbol): The last mapping symbol number.
    (last_stop_offset):
    (last_map_state): MAP_DATA or MAP_INSN.
    (riscv_get_map_state): Get the mapping state from the symbol.
    (riscv_search_mapping_symbol): Check the sorted symbol table, and
    then find the suitable mapping symbol.
    (riscv_data_length): Decide which data size we should print.
    (riscv_disassemble_data): Dump the data contents.
    (print_insn_riscv): Handle the mapping symbols.
    (riscv_symbol_is_valid): Marked mapping symbols as invalid.
binutils/
    pr 27916
    * testsuite/binutils-all/readelf.s: Updated.
    * testsuite/binutils-all/readelf.s-64: Likewise.
    * testsuite/binutils-all/readelf.s-64-unused: Likewise.
    * testsuite/binutils-all/readelf.ss: Likewise.
    * testsuite/binutils-all/readelf.ss-64: Likewise.
    * testsuite/binutils-all/readelf.ss-64-unused: Likewise.
---
 bfd/cpu-riscv.c                               |   9 +
 bfd/cpu-riscv.h                               |   3 +
 bfd/elfnn-riscv.c                             |  18 +-
 binutils/testsuite/binutils-all/readelf.s     |   1 +
 binutils/testsuite/binutils-all/readelf.s-64  |   5 +-
 .../binutils-all/readelf.s-64-unused          |   5 +-
 binutils/testsuite/binutils-all/readelf.ss    |   2 +
 binutils/testsuite/binutils-all/readelf.ss-64 |   8 +-
 .../binutils-all/readelf.ss-64-unused         |   8 +-
 gas/config/tc-riscv.c                         | 243 +++++++++++++++++-
 gas/config/tc-riscv.h                         |  30 +++
 gas/testsuite/gas/riscv/mapping-01.s          |  18 ++
 gas/testsuite/gas/riscv/mapping-01a.d         |  18 ++
 gas/testsuite/gas/riscv/mapping-01b.d         |  21 ++
 gas/testsuite/gas/riscv/mapping-02.s          |  13 +
 gas/testsuite/gas/riscv/mapping-02a.d         |  16 ++
 gas/testsuite/gas/riscv/mapping-02b.d         |  16 ++
 gas/testsuite/gas/riscv/mapping-03.s          |  10 +
 gas/testsuite/gas/riscv/mapping-03a.d         |  16 ++
 gas/testsuite/gas/riscv/mapping-03b.d         |  16 ++
 gas/testsuite/gas/riscv/mapping-04.s          |  11 +
 gas/testsuite/gas/riscv/mapping-04a.d         |  21 ++
 gas/testsuite/gas/riscv/mapping-04b.d         |  24 ++
 gas/testsuite/gas/riscv/mapping-norelax-04a.d |  21 ++
 gas/testsuite/gas/riscv/mapping-norelax-04b.d |  24 ++
 gas/testsuite/gas/riscv/no-relax-align-2.d    |   2 +-
 include/opcode/riscv.h                        |   7 +
 opcodes/riscv-dis.c                           | 224 +++++++++++++++-
 28 files changed, 787 insertions(+), 23 deletions(-)
 create mode 100644 gas/testsuite/gas/riscv/mapping-01.s
 create mode 100644 gas/testsuite/gas/riscv/mapping-01a.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-01b.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-02.s
 create mode 100644 gas/testsuite/gas/riscv/mapping-02a.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-02b.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-03.s
 create mode 100644 gas/testsuite/gas/riscv/mapping-03a.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-03b.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-04.s
 create mode 100644 gas/testsuite/gas/riscv/mapping-04a.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-04b.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-norelax-04a.d
 create mode 100644 gas/testsuite/gas/riscv/mapping-norelax-04b.d

-- 
2.30.2

Comments

Palmer Dabbelt June 3, 2021, 4:18 a.m. | #1
On Fri, 28 May 2021 19:18:18 PDT (-0700), nelson.chu@sifive.com wrote:
> Similar to ARM/AARCH64, we add mapping symbols to the symbol table
> to mark the start addresses of data and instructions.  The $d means data,
> and the $x means instruction.  The disassembler then uses these symbols


We picked "__global_pointer$" under the assumption "$" would avoid any 
conflicts with symbols from user code, so these should be safe too.  
IIRC that's a normal way to do these things, and with Arm already having 
those names that seems like a good way to go.

> to decide whether to dump data or instructions.  Most of the
> implementation is ported from ARM/AARCH64, but there are some differences
> and improvements, as follows,
>
> * We store all mapping symbols of the fragment, rather than only the first
> and last ones.  This helps riscv_check_mapping_symbols clean up the
> redundant mapping symbols added by riscv_add_odd_padding_symbol.
> Consider the case,
>
> $cat tmp.s
> .option norelax
> .option rvc
> .byte 1
> .align 2
> nop
> $riscv64-unknown-elf-as tmp.s -o tmp.o
> $riscv64-unknown-elf-readelf -Ws tmp.o
> 4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $d
> 5: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT    1 $x
> $riscv64-unknown-elf-objdump -d tmp.o
> Disassembly of section .text:
>
> 0000000000000000 <.text>:
>    0:   0000            .short  0x0000
>    2:   0001                    nop
>    4:   0001                    nop
>    6:   0001                    nop
>
> But AARCH64 adds two consecutive $d, which seems redundant.
> Besides, it seems to treat the alignment as DATA, but riscv
> usually fills the alignment space with nops, so I think we can treat
> that space as instructions.
>
> $cat tmp2.s
> .byte 1
> .align 2
> nop
> $aarch64-linux-gnu-as tmp2.s -o tmp2.o
> $aarch64-linux-gnu-readelf -Ws tmp2.o
> 4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $d
> 5: 0000000000000001     0 NOTYPE  LOCAL  DEFAULT    1 $d
> 6: 0000000000000004     0 NOTYPE  LOCAL  DEFAULT    1 $x
> $aarch64-linux-gnu-objdump -d tmp2.o
> Disassembly of section .text:
>
> 0000000000000000 <.text>:
>    0:   00              .byte   0x00
>    1:   00              .byte   0x00
>    2:   0000            .short  0x0000
>    4:   d503201f        nop
>
> * Unlike AARCH64, which uses frag_var to get the space for alignment,
> riscv_frag_align_code uses frag_more.  The riscv_init_frag is called by
> frag_var, and is used to init the alignment mapping state for a new
> fragment.  Therefore, riscv should either use frag_var as well, or add
> the mapping symbol manually in riscv_frag_align_code.  I chose the
> latter, since it changes less and seems safer.
>
> TODOS,
>
> * Maybe we can add mapping symbols that carry the data size, to show more
> details about the data.  For example, $d4 to show a word, and $d1 to show
> a byte.  Besides, for the case,
> .byte 1
> .word 0
>
> For now we only add $d at the start of .byte.  But if we mark the mapping
> symbols as follows,
> $d1
> $d4
> then the disassembly can be closer to the user's original code.
>
> * We need to add a new directive, .inst 0x00000013, to show that the value
> is encoded as an instruction rather than as data.


Maybe just add an explicit 4-byte NOP pseudo instruction or alias?  I
keep finding myself in the position where I want to insert exactly 4
bytes of NOP and the automatic compression gets in the way.
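
For reference, the 0x00000013 value in the quoted .inst example is the uncompressed NOP, i.e. "addi x0, x0, 0" in the standard I-type layout.  A minimal sketch that builds it is below; the helper names encode_addi and encode_nop4 are hypothetical and are not part of gas.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "addi rd, rs1, imm" in the standard RISC-V I-type layout.  */
static uint32_t
encode_addi (unsigned rd, unsigned rs1, int32_t imm)
{
  assert (rd < 32 && rs1 < 32);
  assert (imm >= -2048 && imm <= 2047);

  return ((uint32_t) imm & 0xfff) << 20   /* imm[11:0] */
         | (uint32_t) rs1 << 15
         | 0x0u << 12                     /* funct3 = 000 (ADDI) */
         | (uint32_t) rd << 7
         | 0x13u;                         /* OP-IMM opcode */
}

/* The canonical 4-byte NOP is "addi x0, x0, 0".  */
static uint32_t
encode_nop4 (void)
{
  return encode_addi (0, 0, 0);
}
```

The 2-byte c.nop that shows up in the objdump output above is 0x0001, so the two forms are easy to tell apart in a dump.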

> * For the RVC to NORVC and DATA to INSN transitions, the assembler probably
> needs to add the alignment automatically.  But in fact the things that
> need to be dealt with are more complicated than imagined.


We've been through this a bunch of times and it always ends up somewhat 
hairy.  There's sort of this balance between trying to avoid inserting 
non-executable instructions -- either a single-byte NOP (for re-aligning 
on bytes) or a two-byte NOP outside of C (for re-aligning when turning 
off C).  IIRC we spent a lot of time trying to do a better job than 
we're currently doing and there were always enough edge cases that we 
decided to just make the user deal with these issues explicitly.

If you've got an idea of how to do a better job I'm all ears, just 
warning that this might be more frustration than it's worth.
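
To make the edge cases concrete, here is a simplified model of how a padding region splits into the pieces discussed above: one odd byte that cannot hold any instruction, at most one 2-byte NOP, and 4-byte NOPs for the rest.  It is only a sketch under that assumption, not the algorithm gas uses, and pad_plan and plan_padding are hypothetical names.

```c
#include <assert.h>

struct pad_plan
{
  unsigned zero_bytes;  /* 0 or 1: a lone byte that cannot be a NOP.  */
  unsigned nops2;       /* 0 or 1: a 2-byte c.nop.  */
  unsigned nops4;       /* Number of 4-byte NOPs.  */
};

/* Split the padding needed to reach ALIGN (a power of two, in bytes)
   from OFFSET into one optional odd byte, one optional 2-byte NOP and
   4-byte NOPs.  This is an illustrative model, not the gas algorithm.  */
static struct pad_plan
plan_padding (unsigned long offset, unsigned long align)
{
  struct pad_plan plan;
  unsigned long pad;

  assert (align != 0 && (align & (align - 1)) == 0);

  pad = (align - (offset & (align - 1))) & (align - 1);
  plan.zero_bytes = pad & 1;
  plan.nops2 = (pad >> 1) & 1;
  plan.nops4 = pad >> 2;
  return plan;
}
```

For the quoted tmp.s case (offset 1, 4-byte alignment) this gives one odd byte and one 2-byte NOP, which is where the $d at 0 and the $x at 2 come from.  The awkward cases are the ones where zero_bytes or nops2 is nonzero but the C extension is off.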

> * Consider the testcase mapping-norelax-04b.s: there is a c.nop which is
> used for the alignment, but the rvc extension is disabled.  This may
> look weird and confuse users, so maybe we should treat these alignments
> as MAP_DATA rather than MAP_INSN?  Or maybe as a new MAP_ALIGNMENT?


IMO that's the right way to go here.  The C instructions aren't valid 
without the C ISA, so they're not really quite instructions at that 
point.  As above there's always been a bunch of rough edges around this 
sort of thing, though, so it might be hard to make this work without 
causing a headache somewhere else.

> bfd/

>     pr 27916

>     * cpu-riscv.c (riscv_elf_is_mapping_symbols): Define mapping symbols.

>     * cpu-riscv.h: extern riscv_elf_is_mapping_symbols.

>     * elfnn-riscv.c (riscv_maybe_function_sym): Do not choose mapping

>     symbols as a function name.

>     (riscv_elf_is_target_special_symbol): Add mapping symbols.

> gas/

>     pr 27916

>     * config/tc-riscv.c (make_mapping_symbol): Create a new mapping symbol.

>     (riscv_mapping_state): Decide whether to create mapping symbol for

>     frag_now.

>     (riscv_add_odd_padding_symbol): Add the odd bytes of paddings for

>     riscv_handle_align.

>     (riscv_check_mapping_symbols): Remove all overlapped and redundant

>     mapping symbols.

>     (md_assemble): Marked as instruction.

>     (riscv_frag_align_code): Marked as instruction.

>     (riscv_handle_align): Add mapping symbols for odd padding.

>     (riscv_init_frag): Add mapping symbols for frag, it usually called

>     by frag_var.

>     (s_riscv_insn): Marked as instruction.

>     (riscv_adjust_symtab): Call riscv_check_mapping_symbols for each section.

>     * config/tc-riscv.h (md_cons_align): Defined to riscv_mapping_state

>     with MAP_DATA.

>     (TC_SEGMENT_INFO_TYPE): Record mapping state for each segment.

>     (riscv_frag_mapping_symbol, TC_FRAG_TYPE): Record all mapping symbols

>     information for a fragment.

>     * testsuite/gas/riscv/mapping-01.s: New testcase.

>     * testsuite/gas/riscv/mapping-01a.d: Likewise.

>     * testsuite/gas/riscv/mapping-01b.d: Likewise.

>     * testsuite/gas/riscv/mapping-02.s: Likewise.

>     * testsuite/gas/riscv/mapping-02a.d: Likewise.

>     * testsuite/gas/riscv/mapping-02b.d: Likewise.

>     * testsuite/gas/riscv/mapping-03.s: Likewise.

>     * testsuite/gas/riscv/mapping-03a.d: Likewise.

>     * testsuite/gas/riscv/mapping-03b.d: Likewise.

>     * testsuite/gas/riscv/mapping-04.s: Likewise.

>     * testsuite/gas/riscv/mapping-04a.d: Likewise.

>     * testsuite/gas/riscv/mapping-04b.d: Likewise.

>     * testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

>     * testsuite/gas/riscv/mapping-norelax-04b.d: Likewise.

>     * testsuite/gas/riscv/no-relax-align-2.d: Updated.

> include/

>     pr 27916

>     * opcode/riscv.h (enum riscv_seg_mstate): Added.

> opcodes/

>     pr 27916

>     * riscv-dis.c (last_map_symbol): The last mapping symbol number.

>     (last_stop_offset):

>     (last_map_state): MAP_DATA or MAP_INSN.

>     (riscv_get_map_state): Get the mapping state from the symbol.

>     (riscv_search_mapping_symbol): Check the sorted symbol table, and

>     then find the suitable mapping symbol.

>     (riscv_data_length): Decide which data size we should print.

>     (riscv_disassemble_data): Dump the data contents.

>     (print_insn_riscv): Handle the mapping symbols.

>     (riscv_symbol_is_valid): Marked mapping symbols as invalid.

> binutils/

>     pr 27916

>     * testsuite/binutils-all/readelf.s: Updated.

>     * testsuite/binutils-all/readelf.s-64: Likewise.

>     * testsuite/binutils-all/readelf.s-64-unused: Likewise.

>     * testsuite/binutils-all/readelf.ss: Likewise.

>     * testsuite/binutils-all/readelf.ss-64: Likewise.

>     * testsuite/binutils-all/readelf.ss-64-unused: Likewise.

> ---

>  bfd/cpu-riscv.c                               |   9 +

>  bfd/cpu-riscv.h                               |   3 +

>  bfd/elfnn-riscv.c                             |  18 +-

>  binutils/testsuite/binutils-all/readelf.s     |   1 +

>  binutils/testsuite/binutils-all/readelf.s-64  |   5 +-

>  .../binutils-all/readelf.s-64-unused          |   5 +-

>  binutils/testsuite/binutils-all/readelf.ss    |   2 +

>  binutils/testsuite/binutils-all/readelf.ss-64 |   8 +-

>  .../binutils-all/readelf.ss-64-unused         |   8 +-

>  gas/config/tc-riscv.c                         | 243 +++++++++++++++++-

>  gas/config/tc-riscv.h                         |  30 +++

>  gas/testsuite/gas/riscv/mapping-01.s          |  18 ++

>  gas/testsuite/gas/riscv/mapping-01a.d         |  18 ++

>  gas/testsuite/gas/riscv/mapping-01b.d         |  21 ++

>  gas/testsuite/gas/riscv/mapping-02.s          |  13 +

>  gas/testsuite/gas/riscv/mapping-02a.d         |  16 ++

>  gas/testsuite/gas/riscv/mapping-02b.d         |  16 ++

>  gas/testsuite/gas/riscv/mapping-03.s          |  10 +

>  gas/testsuite/gas/riscv/mapping-03a.d         |  16 ++

>  gas/testsuite/gas/riscv/mapping-03b.d         |  16 ++

>  gas/testsuite/gas/riscv/mapping-04.s          |  11 +

>  gas/testsuite/gas/riscv/mapping-04a.d         |  21 ++

>  gas/testsuite/gas/riscv/mapping-04b.d         |  24 ++

>  gas/testsuite/gas/riscv/mapping-norelax-04a.d |  21 ++

>  gas/testsuite/gas/riscv/mapping-norelax-04b.d |  24 ++

>  gas/testsuite/gas/riscv/no-relax-align-2.d    |   2 +-

>  include/opcode/riscv.h                        |   7 +

>  opcodes/riscv-dis.c                           | 224 +++++++++++++++-

>  28 files changed, 787 insertions(+), 23 deletions(-)

>  create mode 100644 gas/testsuite/gas/riscv/mapping-01.s

>  create mode 100644 gas/testsuite/gas/riscv/mapping-01a.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-01b.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-02.s

>  create mode 100644 gas/testsuite/gas/riscv/mapping-02a.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-02b.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-03.s

>  create mode 100644 gas/testsuite/gas/riscv/mapping-03a.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-03b.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-04.s

>  create mode 100644 gas/testsuite/gas/riscv/mapping-04a.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-04b.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-norelax-04a.d

>  create mode 100644 gas/testsuite/gas/riscv/mapping-norelax-04b.d

>

> diff --git a/bfd/cpu-riscv.c b/bfd/cpu-riscv.c

> index 025e94afd34..6056fa0daa0 100644

> --- a/bfd/cpu-riscv.c

> +++ b/bfd/cpu-riscv.c

> @@ -140,3 +140,12 @@ riscv_get_priv_spec_class_from_numbers (unsigned int major,

>    RISCV_GET_PRIV_SPEC_CLASS (buf, class_t);

>    *class = class_t;

>  }

> +

> +/* Define mapping symbols for riscv.  */

> +

> +bool

> +riscv_elf_is_mapping_symbols (const char *name)

> +{

> +  return (!strcmp (name, "$d")

> +          || !strcmp (name, "$x"));

> +}

> diff --git a/bfd/cpu-riscv.h b/bfd/cpu-riscv.h

> index cafaca23be0..ed5ee7e60d5 100644

> --- a/bfd/cpu-riscv.h

> +++ b/bfd/cpu-riscv.h

> @@ -79,3 +79,6 @@ riscv_get_priv_spec_class_from_numbers (unsigned int,

>  					unsigned int,

>  					unsigned int,

>  					enum riscv_spec_class *);

> +

> +extern bool

> +riscv_elf_is_mapping_symbols (const char *);

> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c

> index d2781f30993..98496c4d3e5 100644

> --- a/bfd/elfnn-riscv.c

> +++ b/bfd/elfnn-riscv.c

> @@ -5106,6 +5106,20 @@ riscv_elf_obj_attrs_arg_type (int tag)

>    return (tag & 1) != 0 ? ATTR_TYPE_FLAG_STR_VAL : ATTR_TYPE_FLAG_INT_VAL;

>  }

>

> +/* Do not choose mapping symbols as a function name.  */

> +

> +static bfd_size_type

> +riscv_maybe_function_sym (const asymbol *sym,

> +			  asection *sec,

> +			  bfd_vma *code_off)

> +{

> +  if (sym->flags & BSF_LOCAL

> +      && riscv_elf_is_mapping_symbols (sym->name))

> +    return 0;

> +

> +  return _bfd_elf_maybe_function_sym (sym, sec, code_off);

> +}

> +

>  /* PR27584, Omit local and empty symbols since they usually generated

>     for pcrel relocations.  */

>

> @@ -5113,7 +5127,8 @@ static bool

>  riscv_elf_is_target_special_symbol (bfd *abfd, asymbol *sym)

>  {

>    return (!strcmp (sym->name, "")

> -	  || _bfd_elf_is_local_label_name (abfd, sym->name));

> +	  || _bfd_elf_is_local_label_name (abfd, sym->name)

> +	  || riscv_elf_is_mapping_symbols (sym->name));

>  }

>

>  #define TARGET_LITTLE_SYM			riscv_elfNN_vec

> @@ -5144,6 +5159,7 @@ riscv_elf_is_target_special_symbol (bfd *abfd, asymbol *sym)

>  #define elf_backend_grok_psinfo			riscv_elf_grok_psinfo

>  #define elf_backend_object_p			riscv_elf_object_p

>  #define elf_backend_write_core_note		riscv_write_core_note

> +#define elf_backend_maybe_function_sym		riscv_maybe_function_sym

>  #define elf_info_to_howto_rel			NULL

>  #define elf_info_to_howto			riscv_info_to_howto_rela

>  #define bfd_elfNN_bfd_relax_section		_bfd_riscv_relax_section

> diff --git a/binutils/testsuite/binutils-all/readelf.s b/binutils/testsuite/binutils-all/readelf.s

> index 6ae4dc756b9..99c2a3a2edd 100644

> --- a/binutils/testsuite/binutils-all/readelf.s

> +++ b/binutils/testsuite/binutils-all/readelf.s

> @@ -12,6 +12,7 @@ Section Headers:

>   +\[ .\] .* +PROGBITS +00000000 0000(3c|40|44|48|50) 0000(04|10) 00 +WA +0 +0 +(.|..)

>   +\[ .\] .* +NOBITS +00000000 0000(40|44|48|4c|60) 000000 00 +WA +0 +0 +(.|..)

>  # ARM targets put .ARM.attributes here

> +# RISC-V targets may put .riscv.attributes here

>  # MIPS targets put .reginfo, .mdebug, .MIPS.abiflags and .gnu.attributes here.

>  # v850 targets put .call_table_data and .call_table_text here.

>  #...

> diff --git a/binutils/testsuite/binutils-all/readelf.s-64 b/binutils/testsuite/binutils-all/readelf.s-64

> index 92ec05f0376..4d3125e6c2f 100644

> --- a/binutils/testsuite/binutils-all/readelf.s-64

> +++ b/binutils/testsuite/binutils-all/readelf.s-64

> @@ -14,11 +14,14 @@ Section Headers:

>   +\[ 4\] .bss +NOBITS +0000000000000000 +000000(4c|50|54|58|68)

>   +0000000000000000 +0000000000000000 +WA +0 +0 +.*

>  # x86 targets may put .note.gnu.property here.

> +# riscv targets may put .riscv.attributes here.

>  #...

>   +\[ .\] .symtab +SYMTAB +0000000000000000 +0+.*

>  # aarch64-elf targets have one more data symbol.

>  # x86 targets may have .note.gnu.property.

> - +0+.* +0000000000000018 +(6|7) +(3|4) +8

> +# riscv targets have two more symbols,

> +# one is mapping data symbol, the other is .riscv.attributes.

> + +0+.* +0000000000000018 +(6|7) +(3|4|5) +8

>   +\[ .\] .strtab +STRTAB +0000000000000000 +0+.*

>   +0+.* +0000000000000000 .* +0 +0 +1

>   +\[ .\] .shstrtab +STRTAB +0000000000000000 +[0-9a-f]+

> diff --git a/binutils/testsuite/binutils-all/readelf.s-64-unused b/binutils/testsuite/binutils-all/readelf.s-64-unused

> index a1e6cd1bbd8..5b638047c48 100644

> --- a/binutils/testsuite/binutils-all/readelf.s-64-unused

> +++ b/binutils/testsuite/binutils-all/readelf.s-64-unused

> @@ -14,11 +14,14 @@ Section Headers:

>   +\[ 4\] .bss +NOBITS +0000000000000000 +000000(4c|50|54|58)

>   +0000000000000000 +0000000000000000 +WA +0 +0 +.*

>  # x86 targets may put .note.gnu.property here.

> +# riscv targets may put .riscv.attributes here.


Did these ".riscv.attributes"-related test diffs sneak in from another 
commit?  I don't see anything here that would generate them, but maybe I 
missed it?

>  #...

>   +\[ .\] .symtab +SYMTAB +0000000000000000 +0+.*

>  # aarch64-elf targets have one more data symbol.

>  # x86 targets may have .note.gnu.property.

> - +0+.* +0000000000000018 +(6|7) +(6|7) +8

> +# riscv targets have two more symbols,

> +# one is mapping data symbol, the other is .riscv.attributes.

> + +0+.* +0000000000000018 +(6|7) +(6|7|8) +8

>   +\[ .\] .strtab +STRTAB +0000000000000000 +0+.*

>   +0+.* +0000000000000000 .* +0 +0 +1

>   +\[ .\] .shstrtab +STRTAB +0000000000000000 +[0-9a-f]+

> diff --git a/binutils/testsuite/binutils-all/readelf.ss b/binutils/testsuite/binutils-all/readelf.ss

> index 5fbb5d002e3..0d37c4def78 100644

> --- a/binutils/testsuite/binutils-all/readelf.ss

> +++ b/binutils/testsuite/binutils-all/readelf.ss

> @@ -5,10 +5,12 @@ Symbol table '.symtab' contains .* entries:

>   +1: 00000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol

>  # ARM targets add the $d mapping symbol here...

>  # NDS32 targets add the $d2 mapping symbol here...

> +# RISC-V targets add the $d mapping symbol here...

>  #...

>   +.: 00000000 +0 +NOTYPE +LOCAL +DEFAULT +[34] static_data_symbol

>  # v850 targets include extra SECTION symbols here for the .call_table_data

>  # and .call_table_text sections.

> +# riscv targets may add the .riscv.attributes section symbol here...

>  #...

>   +[0-9]+: 00000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol

>   +[0-9]+: 00000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol

> diff --git a/binutils/testsuite/binutils-all/readelf.ss-64 b/binutils/testsuite/binutils-all/readelf.ss-64

> index 99a732f71f5..288c9f378ac 100644

> --- a/binutils/testsuite/binutils-all/readelf.ss-64

> +++ b/binutils/testsuite/binutils-all/readelf.ss-64

> @@ -4,12 +4,14 @@ Symbol table '.symtab' contains .* entries:

>   +0: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +UND

>   +1: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol

>  # aarch64-elf targets add the $d mapping symbol here...

> +# riscv targets add the $d mapping symbol here...

>  #...

>   +.: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +3 static_data_symbol

> +# riscv targets may add the .riscv.attributes section symbol here...

>  # ... or here ...

>  #...

> -.* +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol

> - +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol

> - +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol

> +.* +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol

> + +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol

> + +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol

>   +[0-9]+: 0000000000000004 +4 +(COMMON|OBJECT) +GLOBAL +DEFAULT +COM common_symbol

>  #pass

> diff --git a/binutils/testsuite/binutils-all/readelf.ss-64-unused b/binutils/testsuite/binutils-all/readelf.ss-64-unused

> index f48a4b2bbd2..8e96d682dc3 100644

> --- a/binutils/testsuite/binutils-all/readelf.ss-64-unused

> +++ b/binutils/testsuite/binutils-all/readelf.ss-64-unused

> @@ -7,12 +7,14 @@ Symbol table '.symtab' contains .* entries:

>   +3: 0000000000000000 +0 +SECTION +LOCAL +DEFAULT +4.*

>   +4: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol

>  # aarch64-elf targets add the $d mapping symbol here...

> +# riscv targets add the $d mapping symbol here...

>  #...

>   +.: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +3 static_data_symbol

> +# riscv targets may add the .riscv.attributes section symbol here...

>  # ... or here ...

>  #...

> -.* +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol

> - +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol

> - +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol

> +.* +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol

> + +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol

> + +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol

>   +[0-9]+: 0000000000000004 +4 +(COMMON|OBJECT) +GLOBAL +DEFAULT +COM common_symbol

>  #pass

> diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c

> index 42e57529369..fb52aa17e72 100644

> --- a/gas/config/tc-riscv.c

> +++ b/gas/config/tc-riscv.c

> @@ -553,6 +553,199 @@ static bool explicit_priv_attr = false;

>

>  static char *expr_end;

>

> +/* Create a new mapping symbol for the transition to STATE.  */

> +

> +static void

> +make_mapping_symbol (enum riscv_seg_mstate state,

> +		     valueT value,

> +		     fragS *frag)

> +{

> +  struct riscv_frag_mapping_symbol *last, *new;

> +  symbolS *symbol;

> +  const char *name;

> +  int type;

> +

> +  switch (state)

> +    {

> +    case MAP_DATA:

> +      name = "$d";

> +      type = BSF_NO_FLAGS;

> +      break;

> +    case MAP_INSN:

> +      name = "$x";

> +      type = BSF_NO_FLAGS;

> +      break;

> +    default:

> +      abort ();

> +    }

> +

> +  symbol = symbol_new (name, now_seg, frag, value);

> +  symbol_get_bfdsym (symbol)->flags |= type | BSF_LOCAL;

> +

> +  last = frag->tc_frag_data.map_syms;

> +  while (last && last->next != NULL)

> +    last = last->next;

> +  if (last != NULL

> +      && last->symbol != NULL)

> +    {

> +      /* The mapping symbols should be added in offset order.  */

> +      know (S_GET_VALUE (last->symbol) <= S_GET_VALUE (symbol));

> +

> +      /* Replace the last mapping symbol with the new one at

> +	 the same offset.  */

> +      if (S_GET_VALUE (last->symbol) == S_GET_VALUE (symbol))

> +	{

> +	  symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);

> +	  last->state = state;

> +	  last->symbol = symbol;

> +	  return;

> +	}

> +    }

> +

> +  new = XNEW (struct riscv_frag_mapping_symbol);

> +  new->state = state;

> +  new->symbol = symbol;

> +  new->next = NULL;

> +  if (frag->tc_frag_data.map_syms == NULL)

> +    frag->tc_frag_data.map_syms = new;

> +  else

> +    last->next = new;

> +}

> +

> +/* Set the mapping state for frag_now.  */

> +

> +void

> +riscv_mapping_state (enum riscv_seg_mstate to_state,

> +		     int max_chars)

> +{

> +  enum riscv_seg_mstate from_state =

> +	seg_info (now_seg)->tc_segment_info_data.map_state;

> +

> +  /* We only add mapping symbols to the normal section.  */

> +  if (!SEG_NORMAL (now_seg))

> +    return;

> +

> +  /* The mapping symbol has already been emitted.  */

> +  if (from_state == to_state)

> +    return;

> +

> +  /* This should be a data section, don't do anything if we still generate

> +     data in this section.  */

> +  if (from_state == MAP_NONE

> +      && to_state == MAP_DATA

> +      && !subseg_text_p (now_seg))

> +    return;

> +

> +  /* If the instruction isn't generated at the start of the section, then

> +     we need to add the MAP_DATA at the beginning.  */

> +  if (from_state == MAP_NONE

> +      && to_state == MAP_INSN)

> +    {

> +      struct frag *const frag_first = seg_info (now_seg)->frchainP->frch_root;

> +      if (frag_now != frag_first || frag_now_fix () > 0)

> +	make_mapping_symbol (MAP_DATA, (valueT) 0, frag_first);

> +    }

> +

> +  seg_info (now_seg)->tc_segment_info_data.map_state = to_state;

> +  make_mapping_symbol (to_state, (valueT) frag_now_fix () - max_chars, frag_now);

> +}

> +

> +/* Add the odd bytes of paddings for riscv_handle_align.  */

> +

> +static void

> +riscv_add_odd_padding_symbol (fragS *frag)

> +{

> +  /* The MAP_DATA may be added redundantly, if the previous mapping

> +     symbol is also MAP_DATA.  The riscv_check_mapping_symbols will

> +     help to remove it.  */

> +  make_mapping_symbol (MAP_DATA, frag->fr_fix, frag);

> +  make_mapping_symbol (MAP_INSN, frag->fr_fix + 1, frag);

> +}

> +

> +/* Remove all overlapped and redundant mapping symbols.  */

> +

> +static void

> +riscv_check_mapping_symbols (bfd *abfd ATTRIBUTE_UNUSED,

> +			     asection *sec,

> +			     void *dummy ATTRIBUTE_UNUSED)

> +{

> +  segment_info_type *seginfo = seg_info (sec);

> +  fragS *fragp;

> +  enum riscv_seg_mstate state = MAP_NONE;

> +

> +  if (seginfo == NULL || seginfo->frchainP == NULL)

> +    return;

> +

> +  for (fragp = seginfo->frchainP->frch_root;

> +       fragp != NULL;

> +       fragp = fragp->fr_next)

> +    {

> +      struct riscv_frag_mapping_symbol *last, *temp;

> +      temp = fragp->tc_frag_data.map_syms;

> +      last = temp;

> +      for (; temp != NULL; temp = temp->next)

> +	{

> +	  /* The mapping state is same as previous one, so remove it.  */

> +	  if (state == temp->state)

> +	    {

> +	      symbol_remove (temp->symbol, &symbol_rootP, &symbol_lastP);

> +	      temp->symbol = NULL;

> +	      continue;

> +	    }

> +	  state = temp->state;

> +	  last = temp;

> +	}

> +

> +      if (last == NULL || last->symbol == NULL)

> +	continue;

> +

> +      /* Check the last mapping symbol if it is at the boundary of

> +	 fragment.  */

> +      fragS *next = fragp->fr_next;

> +      if (S_GET_VALUE (last->symbol) < next->fr_address)

> +	continue;

> +      know (S_GET_VALUE (last->symbol) == next->fr_address);

> +

> +      /* Since we may have empty frags without any mapping symbols,

> +	 keep looking until the non-empty frag.  */

> +      for (; next != NULL; next = next->fr_next)

> +	{

> +	  temp = next->tc_frag_data.map_syms;

> +	  if (temp != NULL

> +	      && temp->symbol != NULL

> +	      && (S_GET_VALUE (temp->symbol) == next->fr_address))

> +	    {

> +	      /* The last mapping symbol overlaps with another one

> +		 which is at the start of the next frag.  */

> +	      symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);

> +	      last->symbol = NULL;

> +	      break;

> +	    }

> +

> +	  if (next->fr_next == NULL

> +	      && next->fr_fix == 0 && next->fr_var == 0)

> +	    {

> +	      /* The last mapping symbol is at the end of the section.  */

> +	      symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);

> +	      last->symbol = NULL;

> +	      break;

> +	    }

> +

> +	  if (next->fr_address != next->fr_next->fr_address)

> +	    break;

> +	}

> +

> +      /* This frag won't be checked, so free the unused information.  */

> +      temp = fragp->tc_frag_data.map_syms;

> +      while (temp != NULL)

> +	{

> +	  last = temp;

> +	  temp = temp->next;

> +	  free (last);

> +	}

> +    }

> +}

> +

>  /* The default target format to use.  */

>

>  const char *

> @@ -2759,6 +2952,8 @@ md_assemble (char *str)

>         return;

>      }

>

> +  riscv_mapping_state (MAP_INSN, 0);

> +

>    const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, op_hash);

>

>    if (error)

> @@ -3424,6 +3619,8 @@ riscv_frag_align_code (int n)

>    fix_new_exp (frag_now, nops - frag_now->fr_literal, 0,

>  	       &ex, false, BFD_RELOC_RISCV_ALIGN);

>

> +  riscv_mapping_state (MAP_INSN, worst_case_bytes);

> +

>    return true;

>  }

>

> @@ -3443,6 +3640,7 @@ riscv_handle_align (fragS *fragP)

>  	  /* We have 4 byte uncompressed nops.  */

>  	  bfd_signed_vma size = 4;

>  	  bfd_signed_vma excess = bytes % size;

> +	  bfd_boolean odd_padding = (excess % 2 == 1);

>  	  char *p = fragP->fr_literal + fragP->fr_fix;

>

>  	  if (bytes <= 0)

> @@ -3451,12 +3649,20 @@ riscv_handle_align (fragS *fragP)

>  	  /* Insert zeros or compressed nops to get 4 byte alignment.  */

>  	  if (excess)

>  	    {

> +	      if (odd_padding)

> +		riscv_add_odd_padding_symbol (fragP);

>  	      riscv_make_nops (p, excess);

>  	      fragP->fr_fix += excess;

>  	      p += excess;

>  	    }

>

> -	  /* Insert variable number of 4 byte uncompressed nops.  */

> +	  /* The frag will be changed to `rs_fill` later.  The function

> +	     `write_contents` will try to fill the remaining spaces

> +	     according to the patterns we give.  In this case, we give

> +	     a 4 byte uncompressed nop as the pattern, and set the size

> +	     of the pattern into `fr_var`.  The nop will be output to the

> +	     file `fr_offset` times.  However, `fr_offset` could be zero

> +	     if we don't need to pad the boundary finally.  */

>  	  riscv_make_nops (p, size);

>  	  fragP->fr_var = size;

>  	}

> @@ -3467,6 +3673,30 @@ riscv_handle_align (fragS *fragP)

>      }

>  }

>

> +/* This is usually called from frag_var.  */

> +

> +void

> +riscv_init_frag (fragS * fragP, int max_chars)

> +{

> +  /* Do not add mapping symbol to debug sections.  */

> +  if (bfd_section_flags (now_seg) & SEC_DEBUGGING)

> +    return;

> +

> +  switch (fragP->fr_type)

> +    {

> +    case rs_fill:

> +    case rs_align:

> +    case rs_align_test:

> +      riscv_mapping_state (MAP_DATA, max_chars);

> +      break;

> +    case rs_align_code:

> +      riscv_mapping_state (MAP_INSN, max_chars);

> +      break;

> +    default:

> +      break;

> +    }

> +}

> +

>  int

>  md_estimate_size_before_relax (fragS *fragp, asection *segtype)

>  {

> @@ -3723,6 +3953,8 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)

>    save_c = *input_line_pointer;

>    *input_line_pointer = '\0';

>

> +  riscv_mapping_state (MAP_INSN, 0);

> +

>    const char *error = riscv_ip (str, &insn, &imm_expr,

>  				&imm_reloc, insn_type_hash);

>

> @@ -3813,6 +4045,15 @@ riscv_md_end (void)

>    riscv_set_public_attributes ();

>  }

>

> +/* Adjust the symbol table.  */

> +

> +void

> +riscv_adjust_symtab (void)

> +{

> +  bfd_map_over_sections (stdoutput, riscv_check_mapping_symbols, (char *) 0);

> +  elf_adjust_symtab ();

> +}

> +

>  /* Given a symbolic attribute NAME, return the proper integer value.

>     Returns -1 if the attribute is not known.  */

>

> diff --git a/gas/config/tc-riscv.h b/gas/config/tc-riscv.h

> index 1de138458d8..c7dbf215537 100644

> --- a/gas/config/tc-riscv.h

> +++ b/gas/config/tc-riscv.h

> @@ -128,4 +128,34 @@ extern void riscv_elf_final_processing (void);

>  extern void riscv_md_end (void);

>  extern int riscv_convert_symbolic_attribute (const char *);

>

> +/* Set mapping symbol states.  */

> +#define md_cons_align(nbytes) riscv_mapping_state (MAP_DATA, 0)

> +void riscv_mapping_state (enum riscv_seg_mstate, int);

> +

> +#define TC_SEGMENT_INFO_TYPE struct riscv_segment_info_type

> +struct riscv_segment_info_type

> +{

> +  enum riscv_seg_mstate map_state;

> +};

> +

> +struct riscv_frag_mapping_symbol

> +{

> +  enum riscv_seg_mstate state;

> +  symbolS *symbol;

> +  struct riscv_frag_mapping_symbol *next;

> +};

> +

> +/* Define target fragment type.  */

> +#define TC_FRAG_TYPE struct riscv_frag_type

> +struct riscv_frag_type

> +{

> +  struct riscv_frag_mapping_symbol *map_syms;

> +};

> +

> +#define TC_FRAG_INIT(fragp, max_bytes) riscv_init_frag (fragp, max_bytes)

> +extern void riscv_init_frag (struct frag *, int);

> +

> +#define obj_adjust_symtab() riscv_adjust_symtab ()

> +extern void riscv_adjust_symtab (void);

> +

>  #endif /* TC_RISCV */

> diff --git a/gas/testsuite/gas/riscv/mapping-01.s b/gas/testsuite/gas/riscv/mapping-01.s

> new file mode 100644

> index 00000000000..0a270454b1b

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-01.s

> @@ -0,0 +1,18 @@

> +	.option norvc

> +	.text

> +	.global	funcA

> +funcA:

> +	nop

> +	j	funcB

> +

> +	.global	funcB

> +funcB:

> +	nop

> +	bne	a0, a1, funcB

> +

> +	.data

> +	.word 0x123456

> +

> +	.section	.foo, "ax"

> +foo:

> +	nop

> diff --git a/gas/testsuite/gas/riscv/mapping-01a.d b/gas/testsuite/gas/riscv/mapping-01a.d

> new file mode 100644

> index 00000000000..5cd036afbeb

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-01a.d

> @@ -0,0 +1,18 @@

> +#as:

> +#source: mapping-01.s

> +#objdump: --syms --special-syms

> +

> +.*file format.*riscv.*

> +

> +SYMBOL TABLE:

> +0+00 l    d  .text	0+00 .text

> +0+00 l    d  .data	0+00 .data

> +0+00 l    d  .bss	0+00 .bss

> +0+00 l       .text	0+00 \$x

> +0+00 l    d  .foo	0+00 .foo

> +0+00 l       .foo	0+00 foo

> +0+00 l       .foo	0+00 \$x

> +# Maybe section symbol for .riscv.attributes

> +#...

> +0+00 g       .text	0+00 funcA

> +0+08 g       .text	0+00 funcB

> diff --git a/gas/testsuite/gas/riscv/mapping-01b.d b/gas/testsuite/gas/riscv/mapping-01b.d

> new file mode 100644

> index 00000000000..723bd5951cb

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-01b.d

> @@ -0,0 +1,21 @@

> +#as:

> +#source: mapping-01.s

> +#objdump: -d

> +

> +.*:[ 	]+file format .*

> +

> +

> +Disassembly of section .text:

> +

> +0+000 <funcA>:

> +[ 	]+0:[ 	]+00000013[ 	]+nop

> +[ 	]+4:[ 	]+0040006f[ 	]+j[ 	]+8 <funcB>

> +

> +0+008 <funcB>:

> +[ 	]+8:[ 	]+00000013[ 	]+nop

> +[ 	]+c:[ 	]+feb51ee3[ 	]+bne[ 	]+a0,a1,8 <funcB>

> +

> +Disassembly of section .foo:

> +

> +0+000 <foo>:

> +[ 	]+0:[ 	]+00000013[ 	]+nop

> diff --git a/gas/testsuite/gas/riscv/mapping-02.s b/gas/testsuite/gas/riscv/mapping-02.s

> new file mode 100644

> index 00000000000..d3de6dedac8

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-02.s

> @@ -0,0 +1,13 @@

> +	.option norvc

> +	.text

> +	.global main

> +	.type   main, %function

> +main:

> +	nop

> +foo:

> +	nop

> +	nop

> +	.size   main, .-main

> +	.ident  ""

> +

> +	nop

> diff --git a/gas/testsuite/gas/riscv/mapping-02a.d b/gas/testsuite/gas/riscv/mapping-02a.d

> new file mode 100644

> index 00000000000..a60311cb0c1

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-02a.d

> @@ -0,0 +1,16 @@

> +#as:

> +#source: mapping-02.s

> +#objdump: --syms --special-syms

> +

> +.*file format.*riscv.*

> +

> +SYMBOL TABLE:

> +0+00 l    d  .text	0+00 .text

> +0+00 l    d  .data	0+00 .data

> +0+00 l    d  .bss	0+00 .bss

> +0+00 l       .text	0+00 \$x

> +0+04 l       .text	0+00 foo

> +0+00 l    d  .comment	0+00 .comment

> +# Maybe section symbol for .riscv.attributes

> +#...

> +0+00 g     F .text	0+0c main

> diff --git a/gas/testsuite/gas/riscv/mapping-02b.d b/gas/testsuite/gas/riscv/mapping-02b.d

> new file mode 100644

> index 00000000000..3e875600a9c

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-02b.d

> @@ -0,0 +1,16 @@

> +#as:

> +#source: mapping-02.s

> +#objdump: -d

> +

> +.*:[ 	]+file format .*

> +

> +

> +Disassembly of section .text:

> +

> +0+000 <main>:

> +[ 	]+0:[ 	]+00000013[ 	]+nop

> +

> +0+004 <foo>:

> +[ 	]+4:[ 	]+00000013[ 	]+nop

> +[ 	]+8:[ 	]+00000013[ 	]+nop

> +[ 	]+c:[ 	]+00000013[ 	]+nop

> diff --git a/gas/testsuite/gas/riscv/mapping-03.s b/gas/testsuite/gas/riscv/mapping-03.s

> new file mode 100644

> index 00000000000..49964f20288

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-03.s

> @@ -0,0 +1,10 @@

> +	.option norvc

> +	.text

> +	.word 0

> +	nop

> +	.data

> +	.word 0

> +	.text

> +	nop

> +	.short 0

> +	nop

> diff --git a/gas/testsuite/gas/riscv/mapping-03a.d b/gas/testsuite/gas/riscv/mapping-03a.d

> new file mode 100644

> index 00000000000..563db73994a

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-03a.d

> @@ -0,0 +1,16 @@

> +#as:

> +#source: mapping-03.s

> +#objdump: --syms --special-syms

> +

> +.*file format.*riscv.*

> +

> +SYMBOL TABLE:

> +0+00 l    d  .text	0+00 .text

> +0+00 l    d  .data	0+00 .data

> +0+00 l    d  .bss	0+00 .bss

> +0+00 l       .text	0+00 \$d

> +0+04 l       .text	0+00 \$x

> +0+0c l       .text	0+00 \$d

> +0+0e l       .text	0+00 \$x

> +# Maybe section symbol for .riscv.attributes

> +#...

> diff --git a/gas/testsuite/gas/riscv/mapping-03b.d b/gas/testsuite/gas/riscv/mapping-03b.d

> new file mode 100644

> index 00000000000..ed87fcf33d2

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-03b.d

> @@ -0,0 +1,16 @@

> +#as:

> +#source: mapping-03.s

> +#objdump: -d

> +

> +.*:[ 	]+file format .*

> +

> +

> +Disassembly of section .text:

> +

> +0+000 <.text>:

> +[ 	]+0:[ 	]+00000000[ 	]+.word[ 	]+0x00000000

> +[ 	]+4:[ 	]+00000013[ 	]+nop

> +[ 	]+8:[ 	]+00000013[ 	]+nop

> +[ 	]+c:[ 	]+0000[ 	]+.short[ 	]+0x0000

> +[ 	]+e:[ 	]+00000013[ 	]+nop

> +#...

> diff --git a/gas/testsuite/gas/riscv/mapping-04.s b/gas/testsuite/gas/riscv/mapping-04.s

> new file mode 100644

> index 00000000000..36bfac11050

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-04.s

> @@ -0,0 +1,11 @@

> +	.option norvc

> +	.text

> +	nop

> +	.long	0

> +	.align	4

> +	.word	0x12345678

> +	nop

> +	.byte	1

> +	.long	0

> +	.align	4

> +	.word	0x12345678

> diff --git a/gas/testsuite/gas/riscv/mapping-04a.d b/gas/testsuite/gas/riscv/mapping-04a.d

> new file mode 100644

> index 00000000000..048518beb9a

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-04a.d

> @@ -0,0 +1,21 @@

> +#as:

> +#source: mapping-04.s

> +#objdump: --syms --special-syms

> +

> +.*file format.*riscv.*

> +

> +SYMBOL TABLE:

> +0+00 l    d  .text	0+00 .text

> +0+00 l    d  .data	0+00 .data

> +0+00 l    d  .bss	0+00 .bss

> +0+00 l       .text	0+00 \$x

> +0+04 l       .text	0+00 \$d

> +0+08 l       .text	0+00 \$x

> +0+14 l       .text	0+00 \$d

> +0+18 l       .text	0+00 \$x

> +0+1c l       .text	0+00 \$d

> +0+21 l       .text	0+00 \$x

> +0+2d l       .text	0+00 \$d

> +0+31 l       .text	0+00 \$x

> +# Maybe section symbol for .riscv.attributes

> +#...

> diff --git a/gas/testsuite/gas/riscv/mapping-04b.d b/gas/testsuite/gas/riscv/mapping-04b.d

> new file mode 100644

> index 00000000000..64d9f558753

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-04b.d

> @@ -0,0 +1,24 @@

> +#as:

> +#source: mapping-04.s

> +#objdump: -d

> +

> +.*:[ 	]+file format .*

> +

> +

> +Disassembly of section .text:

> +

> +0+000 <.text>:

> +[ 	]+0:[ 	]+00000013[ 	]+nop

> +[ 	]+4:[ 	]+00000000[ 	]+.word[ 	]+0x00000000

> +[ 	]+8:[ 	]+00000013[ 	]+nop

> +[ 	]+c:[ 	]+00000013[ 	]+nop

> +[ 	]+10:[ 	]+00000013[ 	]+nop

> +[ 	]+14:[ 	]+12345678[ 	]+.word[ 	]+0x12345678

> +[ 	]+18:[ 	]+00000013[ 	]+nop

> +[ 	]+1c:[ 	]+00000001[ 	]+.word[ 	]+0x00000001

> +[ 	]+20:[ 	]+00[ 	]+.byte[ 	]+0x00

> +[ 	]+21:[ 	]+00000013[ 	]+nop

> +[ 	]+25:[ 	]+00000013[ 	]+nop

> +[ 	]+29:[ 	]+00000013[ 	]+nop

> +[ 	]+2d:[ 	]+12345678[ 	]+.word[ 	]+0x12345678

> +#...

> diff --git a/gas/testsuite/gas/riscv/mapping-norelax-04a.d b/gas/testsuite/gas/riscv/mapping-norelax-04a.d

> new file mode 100644

> index 00000000000..0792068890e

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-norelax-04a.d

> @@ -0,0 +1,21 @@

> +#as: -mno-relax

> +#source: mapping-04.s

> +#objdump: --syms --special-syms

> +

> +.*file format.*riscv.*

> +

> +SYMBOL TABLE:

> +0+00 l    d  .text	0+00 .text

> +0+00 l    d  .data	0+00 .data

> +0+00 l    d  .bss	0+00 .bss

> +0+00 l       .text	0+00 \$x

> +0+04 l       .text	0+00 \$d

> +0+08 l       .text	0+00 \$x

> +0+10 l       .text	0+00 \$d

> +0+14 l       .text	0+00 \$x

> +0+18 l       .text	0+00 \$d

> +0+20 l       .text	0+00 \$d

> +0+24 l       .text	0+00 \$x

> +0+1e l       .text	0+00 \$x

> +# Maybe section symbol for .riscv.attributes

> +#...

> diff --git a/gas/testsuite/gas/riscv/mapping-norelax-04b.d b/gas/testsuite/gas/riscv/mapping-norelax-04b.d

> new file mode 100644

> index 00000000000..71484bbf4c3

> --- /dev/null

> +++ b/gas/testsuite/gas/riscv/mapping-norelax-04b.d

> @@ -0,0 +1,24 @@

> +#as: -mno-relax

> +#source: mapping-04.s

> +#objdump: -d

> +

> +.*:[ 	]+file format .*

> +

> +

> +Disassembly of section .text:

> +

> +0+000 <.text>:

> +[ 	]+0:[ 	]+00000013[ 	]+nop

> +[ 	]+4:[ 	]+00000000[ 	]+.word[ 	]+0x00000000

> +[ 	]+8:[ 	]+00000013[ 	]+nop

> +[ 	]+c:[ 	]+00000013[ 	]+nop

> +[ 	]+10:[ 	]+12345678[ 	]+.word[ 	]+0x12345678

> +[ 	]+14:[ 	]+00000013[ 	]+nop

> +[ 	]+18:[ 	]+00000001[ 	]+.word[ 	]+0x00000001

> +[ 	]+1c:[ 	]+0000[ 	]+.short[ 	]+0x0000

> +[ 	]+1e:[ 	]+0001[ 	]+nop

> +[ 	]+20:[ 	]+12345678[ 	]+.word[ 	]+0x12345678

> +[ 	]+24:[ 	]+00000013[ 	]+nop

> +[ 	]+28:[ 	]+00000013[ 	]+nop

> +[ 	]+2c:[ 	]+00000013[ 	]+nop

> +#...

> diff --git a/gas/testsuite/gas/riscv/no-relax-align-2.d b/gas/testsuite/gas/riscv/no-relax-align-2.d

> index 7407b495a8f..0c7280558f0 100644

> --- a/gas/testsuite/gas/riscv/no-relax-align-2.d

> +++ b/gas/testsuite/gas/riscv/no-relax-align-2.d

> @@ -7,7 +7,7 @@

>  Disassembly of section .text:

>

>  0+000 <.text>:

> -[ 	]+0:[ 	]+0000[ 	]+unimp

> +[ 	]+0:[ 	]+0000[ 	]+.short[ 	]+0x0000

>  [ 	]+2:[ 	]+0001[ 	]+nop

>  [ 	]+4:[ 	]+00000013[ 	]+nop

>  [ 	]+8:[ 	]+00000013[ 	]+nop

> diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h

> index fdf3df4f5c1..5291ab7e4e7 100644

> --- a/include/opcode/riscv.h

> +++ b/include/opcode/riscv.h

> @@ -425,6 +425,13 @@ enum

>    M_NUM_MACROS

>  };

>

> +/* The mapping symbol states.  */

> +enum riscv_seg_mstate

> +{

> +  MAP_NONE = 0, /* Must be zero, for seginfo in new sections.  */

> +  MAP_DATA,

> +  MAP_INSN,

> +};

>

>  extern const char * const riscv_gpr_names_numeric[NGPR];

>  extern const char * const riscv_gpr_names_abi[NGPR];

> diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c

> index fe8dfb88d90..89aa6e59244 100644

> --- a/opcodes/riscv-dis.c

> +++ b/opcodes/riscv-dis.c

> @@ -41,6 +41,10 @@ struct riscv_private_data

>    bfd_vma hi_addr[OP_MASK_RD + 1];

>  };

>

> +static int last_map_symbol = -1;

> +static bfd_vma last_stop_offset = 0;

> +enum riscv_seg_mstate last_map_state;

> +

>  static const char * const *riscv_gpr_names;

>  static const char * const *riscv_fpr_names;

>

> @@ -556,13 +560,189 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)

>    return insnlen;

>  }

>

> +static bool

> +riscv_get_map_state (int n,

> +		     enum riscv_seg_mstate *state,

> +		     struct disassemble_info *info)

> +{

> +  const char *name;

> +

> +  /* If the symbol is in a different section, ignore it.  */

> +  if (info->section != NULL

> +      && info->section != info->symtab[n]->section)

> +    return false;

> +

> +  name = bfd_asymbol_name(info->symtab[n]);

> +  if (strcmp (name, "$x") == 0)

> +    *state = MAP_INSN;

> +  else if (strcmp (name, "$d") == 0)

> +    *state = MAP_DATA;

> +  else

> +    return false;

> +

> +  return true;

> +}

> +

> +/* Check the sorted symbol table (sorted by the symbol value), find the

> +   suitable mapping symbols.  */

> +

> +static enum riscv_seg_mstate

> +riscv_search_mapping_symbol (bfd_vma memaddr,

> +			     struct disassemble_info *info)

> +{

> +  enum riscv_seg_mstate mstate;

> +  bool from_last_map_symbol;

> +  bool found = false;

> +  int symbol = -1;

> +  int n;

> +

> +  /* Decide whether to print the data or instruction by default, in case

> +     we can not find the corresponding mapping symbols.  */

> +  mstate = MAP_DATA;

> +  if ((info->section

> +       && info->section->flags & SEC_CODE)

> +      || !info->section)

> +    mstate = MAP_INSN;

> +

> +  /* Return default mapping state if there are no suitable symbols.  */

> +  if (info->symtab_size == 0

> +      || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour)

> +    return mstate;

> +

> +  /* Reset the last_map_symbol if we start to dump a new section.  */

> +  if (memaddr <= 0)

> +    last_map_symbol = -1;

> +

> +  /* If the last stop offset is different from the current one, then

> +     don't use the last_map_symbol to search.  We usually reset the

> +     info->stop_offset when handling a new section.  */

> +  from_last_map_symbol = (last_map_symbol >= 0

> +			  && info->stop_offset == last_stop_offset);

> +

> +  /* Start scanning at the info->symtab_pos or the last_map_symbol.

> +     Try to find the suitable mapping symbol until the current pc.  */

> +  n = info->symtab_pos + 1;

> +  if (from_last_map_symbol && n >= last_map_symbol)

> +    n = last_map_symbol;

> +

> +  /* Find the suitable mapping symbol to dump.  */

> +  for (; n < info->symtab_size; n++)

> +    {

> +      bfd_vma addr = bfd_asymbol_value (info->symtab[n]);

> +      /* We have searched all possible symbols in the range.  */

> +      if (addr > memaddr)

> +	break;

> +      /* Do not stop searching, in case there are some mapping

> +        symbols that have the same value but different names.

> +        Use the last one.  */

> +      if (riscv_get_map_state (n, &mstate, info))

> +	{

> +	  symbol = n;

> +	  found = true;

> +	}

> +    }

> +

> +  /* We can not find the suitable mapping symbol above.  Therefore, we

> +     look forwards and try to find it again, but don't go past the start

> +     of the section.  Otherwise a data section without mapping symbols

> +     can pick up a text mapping symbol of a preceding section.  */

> +  if (!found)

> +    {

> +      n = info->symtab_pos;

> +      if (from_last_map_symbol && n >= last_map_symbol)

> +      n = last_map_symbol;

> +

> +      for (; n >= 0; n--)

> +	{

> +	  bfd_vma addr = bfd_asymbol_value (info->symtab[n]);

> +	  /* We have searched all possible symbols in the range.  */

> +	  if (addr < (info->section ? info->section->vma : 0))

> +	    break;

> +	  /* Stop searching once we find the closest mapping symbol.  */

> +	  if (riscv_get_map_state (n, &mstate, info))

> +	    {

> +	      symbol = n;

> +	      found = true;

> +	      break;

> +	    }

> +	}

> +    }

> +

> +  /* Save the information for next use.  */

> +  last_map_symbol = symbol;

> +  last_stop_offset = info->stop_offset;

> +

> +  return mstate;

> +}

> +

> +/* Decide which data size we should print.  */

> +

> +static bfd_vma

> +riscv_data_length (bfd_vma memaddr, struct disassemble_info *info)

> +{

> +  bfd_vma size = 4;

> +  int n;

> +

> +  /* Return the default data size if there are no suitable symbols.  */

> +  if (info->symtab_size == 0

> +      || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour

> +      || last_map_symbol < 0)

> +    return size;

> +

> +  for (n = last_map_symbol + 1; n < info->symtab_size; n++)

> +    {

> +      bfd_vma addr = bfd_asymbol_value (info->symtab[n]);

> +      if (addr > memaddr)

> +	{

> +	  if (addr - memaddr < size)

> +	    size = addr - memaddr;

> +	  break;

> +	}

> +    }

> +  size = size == 3 ? 2 : size;


I'm not quite sure how this works, but would this result in something 
like

    struct threebyte {
        char a, b, c;
    };
    struct threebyte x;

being interpreted as a 2-byte symbol and thus dropping the last byte 
from the disassembly?

> +

> +  return size;

> +}

> +

> +/* Dump the data contents.  */

> +

> +static int

> +riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,

> +			insn_t data,

> +			disassemble_info *info)

> +{

> +  info->display_endian = info->endian;

> +

> +  switch (info->bytes_per_chunk)

> +    {

> +    case 1:

> +      (*info->fprintf_func) (info->stream, ".byte\t0x%02llx",

> +			     (unsigned long long) data);

> +      break;

> +    case 2:

> +      (*info->fprintf_func) (info->stream, ".short\t0x%04llx",

> +			     (unsigned long long) data);

> +      break;

> +    case 4:

> +      (*info->fprintf_func) (info->stream, ".word\t0x%08llx",

> +			     (unsigned long long) data);

> +      break;

> +    default:

> +      abort ();


We only have 2-byte and 4-byte instructions, but now that we have 
disassembly for data segments can we have 8-byte chunks from .long?  
Either way, it's probably best to have a bug message here rather than 
just an abort() -- IIUC these are coming from the ELF, so even if we 
can't generate them we shouldn't abort() on them.

> +    }

> +  return info->bytes_per_chunk;

> +}

> +

>  int

>  print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)

>  {

> -  bfd_byte packet[2];

> +  bfd_byte packet[8];

>    insn_t insn = 0;

> -  bfd_vma n;

> +  bfd_vma dump_size;

>    int status;

> +  enum riscv_seg_mstate mstate;

> +  int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);

> +

>

>    if (info->disassembler_options != NULL)

>      {

> @@ -573,23 +753,42 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)

>    else if (riscv_gpr_names == NULL)

>      set_default_riscv_dis_options ();

>

> -  /* Instructions are a sequence of 2-byte packets in little-endian order.  */

> -  for (n = 0; n < sizeof (insn) && n < riscv_insn_length (insn); n += 2)

> +  mstate = riscv_search_mapping_symbol (memaddr, info);

> +  /* Save the last mapping state.  */

> +  last_map_state = mstate;

> +

> +  /* Set the size to dump.  */

> +  if (mstate == MAP_DATA &&

> +      (info->flags & DISASSEMBLE_DATA) == 0)

> +    {

> +      dump_size = riscv_data_length (memaddr, info);

> +      info->bytes_per_chunk = dump_size;

> +      riscv_disassembler = riscv_disassemble_data;

> +    }

> +  else

>      {

> -      status = (*info->read_memory_func) (memaddr + n, packet, 2, info);

> +      /* Get the first 2 bytes to check the length of the instruction.  */

> +      status = (*info->read_memory_func) (memaddr, packet, 2, info);

>        if (status != 0)

>  	{

> -	  /* Don't fail just because we fell off the end.  */

> -	  if (n > 0)

> -	    break;

>  	  (*info->memory_error_func) (status, memaddr, info);

> -	  return status;

> +	  return 1;

>  	}

> +      insn = (insn_t) bfd_getl16 (packet);

> +      dump_size = riscv_insn_length (insn);

> +      riscv_disassembler = riscv_disassemble_insn;

> +    }

>

> -      insn |= ((insn_t) bfd_getl16 (packet)) << (8 * n);

> +  /* Fetch the instruction to dump.  */

> +  status = (*info->read_memory_func) (memaddr, packet, dump_size, info);

> +  if (status != 0)

> +    {

> +      (*info->memory_error_func) (status, memaddr, info);

> +      return 1;

>      }

> +  insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);

>

> -  return riscv_disassemble_insn (memaddr, insn, info);

> +  return (*riscv_disassembler) (memaddr, insn, info);

>  }

>

>  disassembler_ftype

> @@ -631,7 +830,8 @@ riscv_symbol_is_valid (asymbol * sym,

>

>    name = bfd_asymbol_name (sym);

>

> -  return (strcmp (name, RISCV_FAKE_LABEL_NAME) != 0);

> +  return (strcmp (name, RISCV_FAKE_LABEL_NAME) != 0

> +	  && !riscv_elf_is_mapping_symbols (name));

>  }

>

>  void
Nelson Chu June 24, 2021, 2:06 a.m. | #2
Hi Palmer,

Sorry for the late reply, thanks for your suggestion :)

On Thu, Jun 3, 2021 at 12:18 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>

> On Fri, 28 May 2021 19:18:18 PDT (-0700), nelson.chu@sifive.com wrote:

> > Similar to ARM/AARCH64, we add mapping symbols in the symbol table,

> > to mark the start addresses of data and instructions.  The $d means data,

> > and the $x means instruction.  Then the disassembler uses these symbols

>

> We picked "__global_pointer$" under the assumption "$" would avoid any

> conflicts with symbols from user code, so these should be safe too.

> IIRC that's a normal way to do these things, and with Arm already having

> those names that seems like a good way to go.


Thanks for agreeing with this.

> > TODOS,

> > * Need to add new directive, .inst 0x00000013, to show this is encoded as an

> > instruction, rather than as data.

>

> Maybe just add an explicit 4-byte NOP pseudo instruction or alias?  I

> keep finding myself in the position where I want to insert exactly 4

> bytes of NOP and the automatic compression gets in the way.


Oh, nop may not have been a good example.  We have the .insn
directive to let users encode their own instructions, or new
instructions that older binutils don't support yet.  For
example, if users want to use the SiFive cache instruction, they cannot
just write "cflush.d.l1" and have it assembled by the FSF binutils, but
they can write ".insn i SYSTEM, 0, x0, x10, -0x40".  But as you can see,
the .insn directive may not be easy to use in some cases.  Therefore, I
believe most users are using ".word 0xfc050073" rather than
.insn.  But if they always use ".word xxxxx" to encode their
instructions, then the mapping symbols will treat these .word as data
rather than as instructions.  I see that Arm defines .inst/.inst.w/.inst.n
to resolve the problem, so maybe we can follow the same approach.

> > * Consider RVC to NORVC, and DATA to INSN, we probably need to add the

> > alignments automatically by assembler.  But in fact the things that need

> > to be dealt with are more complicated than imagined.

>

> We've been through this a bunch of times and it always ends up somewhat

> hairy.  There's sort of this balance between trying to avoid inserting

> non-executable instructions -- either a single-byte NOP (for re-aligning

> on bytes) or a two-byte NOP outside of C (for re-aligning when turning

> off C).  IIRC we spent a lot of time trying to do a better job than

> we're currently doing and there were always enough edge cases that we

> decided to just make the user deal with these issues explicitly.

>

> If you've got an idea of how to do a better job I'm all ears, just

> warning that this might be more frustration than it's worth.


Thanks, this issue really frustrated me for a long time, haha.
The priority of this isn't high, so I'll just record it as one of the
TODOs; I don't want to give up on it.  Maybe someday someone will
have an amazing idea that surprises us.

> > * Consider the testcase mapping-norelax-04b.d, there is a c.nop which is

> > used to do the alignment, but the rvc extension is disabled.  This may

> > look weird and confuse users, so maybe we should treat these alignments

> > as MAP_DATA rather than MAP_INSN?  Or maybe treat as a new MAP_ALIGNMENT?

>

> IMO that's the right way to go here.  The C instructions aren't valid

> without the C ISA, so they're not really quite instructions at that

> point.  As above there's always been a bunch of rough edges around this

> sort of thing, though, so it might be hard to make this work without

> causing a headache somewhere else.


Yeah, considering just one of these issues may be fine, but if we consider
all of them at the same time, then things will start to get out of
control...


Thanks again for the feedback
Nelson

Patch

diff --git a/bfd/cpu-riscv.c b/bfd/cpu-riscv.c
index 025e94afd34..6056fa0daa0 100644
--- a/bfd/cpu-riscv.c
+++ b/bfd/cpu-riscv.c
@@ -140,3 +140,12 @@  riscv_get_priv_spec_class_from_numbers (unsigned int major,
   RISCV_GET_PRIV_SPEC_CLASS (buf, class_t);
   *class = class_t;
 }
+
+/* Define mapping symbols for riscv.  */
+
+bool
+riscv_elf_is_mapping_symbols (const char *name)
+{
+  return (!strcmp (name, "$d")
+          || !strcmp (name, "$x"));
+}
diff --git a/bfd/cpu-riscv.h b/bfd/cpu-riscv.h
index cafaca23be0..ed5ee7e60d5 100644
--- a/bfd/cpu-riscv.h
+++ b/bfd/cpu-riscv.h
@@ -79,3 +79,6 @@  riscv_get_priv_spec_class_from_numbers (unsigned int,
 					unsigned int,
 					unsigned int,
 					enum riscv_spec_class *);
+
+extern bool
+riscv_elf_is_mapping_symbols (const char *);
diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index d2781f30993..98496c4d3e5 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -5106,6 +5106,20 @@  riscv_elf_obj_attrs_arg_type (int tag)
   return (tag & 1) != 0 ? ATTR_TYPE_FLAG_STR_VAL : ATTR_TYPE_FLAG_INT_VAL;
 }
 
+/* Do not choose mapping symbols as a function name.  */
+
+static bfd_size_type
+riscv_maybe_function_sym (const asymbol *sym,
+			  asection *sec,
+			  bfd_vma *code_off)
+{
+  if (sym->flags & BSF_LOCAL
+      && riscv_elf_is_mapping_symbols (sym->name))
+    return 0;
+
+  return _bfd_elf_maybe_function_sym (sym, sec, code_off);
+}
+
 /* PR27584, Omit local and empty symbols since they usually generated
    for pcrel relocations.  */
 
@@ -5113,7 +5127,8 @@  static bool
 riscv_elf_is_target_special_symbol (bfd *abfd, asymbol *sym)
 {
   return (!strcmp (sym->name, "")
-	  || _bfd_elf_is_local_label_name (abfd, sym->name));
+	  || _bfd_elf_is_local_label_name (abfd, sym->name)
+	  || riscv_elf_is_mapping_symbols (sym->name));
 }
 
 #define TARGET_LITTLE_SYM			riscv_elfNN_vec
@@ -5144,6 +5159,7 @@  riscv_elf_is_target_special_symbol (bfd *abfd, asymbol *sym)
 #define elf_backend_grok_psinfo			riscv_elf_grok_psinfo
 #define elf_backend_object_p			riscv_elf_object_p
 #define elf_backend_write_core_note		riscv_write_core_note
+#define elf_backend_maybe_function_sym		riscv_maybe_function_sym
 #define elf_info_to_howto_rel			NULL
 #define elf_info_to_howto			riscv_info_to_howto_rela
 #define bfd_elfNN_bfd_relax_section		_bfd_riscv_relax_section
diff --git a/binutils/testsuite/binutils-all/readelf.s b/binutils/testsuite/binutils-all/readelf.s
index 6ae4dc756b9..99c2a3a2edd 100644
--- a/binutils/testsuite/binutils-all/readelf.s
+++ b/binutils/testsuite/binutils-all/readelf.s
@@ -12,6 +12,7 @@  Section Headers:
  +\[ .\] .* +PROGBITS +00000000 0000(3c|40|44|48|50) 0000(04|10) 00 +WA +0 +0 +(.|..)
  +\[ .\] .* +NOBITS +00000000 0000(40|44|48|4c|60) 000000 00 +WA +0 +0 +(.|..)
 # ARM targets put .ARM.attributes here
+# RISC-V targets may put .riscv.attributes here
 # MIPS targets put .reginfo, .mdebug, .MIPS.abiflags and .gnu.attributes here.
 # v850 targets put .call_table_data and .call_table_text here.
 #...
diff --git a/binutils/testsuite/binutils-all/readelf.s-64 b/binutils/testsuite/binutils-all/readelf.s-64
index 92ec05f0376..4d3125e6c2f 100644
--- a/binutils/testsuite/binutils-all/readelf.s-64
+++ b/binutils/testsuite/binutils-all/readelf.s-64
@@ -14,11 +14,14 @@  Section Headers:
  +\[ 4\] .bss +NOBITS +0000000000000000 +000000(4c|50|54|58|68)
  +0000000000000000 +0000000000000000 +WA +0 +0 +.*
 # x86 targets may put .note.gnu.property here.
+# riscv targets may put .riscv.attributes here.
 #...
  +\[ .\] .symtab +SYMTAB +0000000000000000 +0+.*
 # aarch64-elf targets have one more data symbol.
 # x86 targets may have .note.gnu.property.
- +0+.* +0000000000000018 +(6|7) +(3|4) +8
+# riscv targets have two more symbols,
+# one is mapping data symbol, the other is .riscv.attributes.
+ +0+.* +0000000000000018 +(6|7) +(3|4|5) +8
  +\[ .\] .strtab +STRTAB +0000000000000000 +0+.*
  +0+.* +0000000000000000 .* +0 +0 +1
  +\[ .\] .shstrtab +STRTAB +0000000000000000 +[0-9a-f]+
diff --git a/binutils/testsuite/binutils-all/readelf.s-64-unused b/binutils/testsuite/binutils-all/readelf.s-64-unused
index a1e6cd1bbd8..5b638047c48 100644
--- a/binutils/testsuite/binutils-all/readelf.s-64-unused
+++ b/binutils/testsuite/binutils-all/readelf.s-64-unused
@@ -14,11 +14,14 @@  Section Headers:
  +\[ 4\] .bss +NOBITS +0000000000000000 +000000(4c|50|54|58)
  +0000000000000000 +0000000000000000 +WA +0 +0 +.*
 # x86 targets may put .note.gnu.property here.
+# riscv targets may put .riscv.attributes here.
 #...
  +\[ .\] .symtab +SYMTAB +0000000000000000 +0+.*
 # aarch64-elf targets have one more data symbol.
 # x86 targets may have .note.gnu.property.
- +0+.* +0000000000000018 +(6|7) +(6|7) +8
+# riscv targets have two more symbols,
+# one is mapping data symbol, the other is .riscv.attributes.
+ +0+.* +0000000000000018 +(6|7) +(6|7|8) +8
  +\[ .\] .strtab +STRTAB +0000000000000000 +0+.*
  +0+.* +0000000000000000 .* +0 +0 +1
  +\[ .\] .shstrtab +STRTAB +0000000000000000 +[0-9a-f]+
diff --git a/binutils/testsuite/binutils-all/readelf.ss b/binutils/testsuite/binutils-all/readelf.ss
index 5fbb5d002e3..0d37c4def78 100644
--- a/binutils/testsuite/binutils-all/readelf.ss
+++ b/binutils/testsuite/binutils-all/readelf.ss
@@ -5,10 +5,12 @@  Symbol table '.symtab' contains .* entries:
  +1: 00000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol
 # ARM targets add the $d mapping symbol here...
 # NDS32 targets add the $d2 mapping symbol here...
+# RISC-V targets add the $d mapping symbol here...
 #...
  +.: 00000000 +0 +NOTYPE +LOCAL +DEFAULT +[34] static_data_symbol
 # v850 targets include extra SECTION symbols here for the .call_table_data
 # and .call_table_text sections.
+# riscv targets may add the .riscv.attributes section symbol here...
 #...
  +[0-9]+: 00000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol
  +[0-9]+: 00000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol
diff --git a/binutils/testsuite/binutils-all/readelf.ss-64 b/binutils/testsuite/binutils-all/readelf.ss-64
index 99a732f71f5..288c9f378ac 100644
--- a/binutils/testsuite/binutils-all/readelf.ss-64
+++ b/binutils/testsuite/binutils-all/readelf.ss-64
@@ -4,12 +4,14 @@  Symbol table '.symtab' contains .* entries:
  +0: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +UND 
  +1: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol
 # aarch64-elf targets add the $d mapping symbol here...
+# riscv targets add the $d mapping symbol here...
 #...
  +.: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +3 static_data_symbol
+# riscv targets may add the .riscv.attributes section symbol here...
 # ... or here ...
 #...
-.* +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol
- +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol
- +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol
+.* +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol
+ +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol
+ +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol
  +[0-9]+: 0000000000000004 +4 +(COMMON|OBJECT) +GLOBAL +DEFAULT +COM common_symbol
 #pass
diff --git a/binutils/testsuite/binutils-all/readelf.ss-64-unused b/binutils/testsuite/binutils-all/readelf.ss-64-unused
index f48a4b2bbd2..8e96d682dc3 100644
--- a/binutils/testsuite/binutils-all/readelf.ss-64-unused
+++ b/binutils/testsuite/binutils-all/readelf.ss-64-unused
@@ -7,12 +7,14 @@  Symbol table '.symtab' contains .* entries:
  +3: 0000000000000000 +0 +SECTION +LOCAL +DEFAULT +4.*
  +4: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +1 static_text_symbol
 # aarch64-elf targets add the $d mapping symbol here...
+# riscv targets add the $d mapping symbol here...
 #...
  +.: 0000000000000000 +0 +NOTYPE +LOCAL +DEFAULT +3 static_data_symbol
+# riscv targets may add the .riscv.attributes section symbol here...
 # ... or here ...
 #...
-.* +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol
- +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol
- +.: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol
+.* +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +1 text_symbol
+ +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +UND external_symbol
+ +[0-9]+: 0000000000000000 +0 +NOTYPE +GLOBAL +DEFAULT +3 data_symbol
  +[0-9]+: 0000000000000004 +4 +(COMMON|OBJECT) +GLOBAL +DEFAULT +COM common_symbol
 #pass
diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 42e57529369..fb52aa17e72 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -553,6 +553,199 @@  static bool explicit_priv_attr = false;
 
 static char *expr_end;
 
+/* Create a new mapping symbol for the transition to STATE.  */
+
+static void
+make_mapping_symbol (enum riscv_seg_mstate state,
+		     valueT value,
+		     fragS *frag)
+{
+  struct riscv_frag_mapping_symbol *last, *new;
+  symbolS *symbol;
+  const char *name;
+  int type;
+
+  switch (state)
+    {
+    case MAP_DATA:
+      name = "$d";
+      type = BSF_NO_FLAGS;
+      break;
+    case MAP_INSN:
+      name = "$x";
+      type = BSF_NO_FLAGS;
+      break;
+    default:
+      abort ();
+    }
+
+  symbol = symbol_new (name, now_seg, frag, value);
+  symbol_get_bfdsym (symbol)->flags |= type | BSF_LOCAL;
+
+  last = frag->tc_frag_data.map_syms;
+  while (last && last->next != NULL)
+    last = last->next;
+  if (last != NULL
+      && last->symbol != NULL)
+    {
+      /* The mapping symbols should be added in offset order.  */
+      know (S_GET_VALUE (last->symbol) <= S_GET_VALUE (symbol));
+
+      /* Replace the last mapping symbol with the new one at
+	 the same offset.  */
+      if (S_GET_VALUE (last->symbol) == S_GET_VALUE (symbol))
+	{
+	  symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);
+	  last->state = state;
+	  last->symbol = symbol;
+	  return;
+	}
+    }
+
+  new = XNEW (struct riscv_frag_mapping_symbol);
+  new->state = state;
+  new->symbol = symbol;
+  new->next = NULL;
+  if (frag->tc_frag_data.map_syms == NULL)
+    frag->tc_frag_data.map_syms = new;
+  else
+    last->next = new;
+}
+
+/* Set the mapping state for frag_now.  */
+
+void
+riscv_mapping_state (enum riscv_seg_mstate to_state,
+		     int max_chars)
+{
+  enum riscv_seg_mstate from_state =
+	seg_info (now_seg)->tc_segment_info_data.map_state;
+
+  /* We only add mapping symbols to the normal section.  */
+  if (!SEG_NORMAL (now_seg))
+    return;
+
+  /* The mapping symbol should already be emitted.  */
+  if (from_state == to_state)
+    return;
+
+  /* This should be a data section, don't do anything if we still generate
+     data in this section.  */
+  if (from_state == MAP_NONE
+      && to_state == MAP_DATA
+      && !subseg_text_p (now_seg))
+    return;
+
+  /* If the instruction isn't generated at the start of the section, then
+     we need to add the MAP_DATA at the beginning.  */
+  if (from_state == MAP_NONE
+      && to_state == MAP_INSN)
+    {
+      struct frag *const frag_first = seg_info (now_seg)->frchainP->frch_root;
+      if (frag_now != frag_first || frag_now_fix () > 0)
+	make_mapping_symbol (MAP_DATA, (valueT) 0, frag_first);
+    }
+
+  seg_info (now_seg)->tc_segment_info_data.map_state = to_state;
+  make_mapping_symbol (to_state, (valueT) frag_now_fix () - max_chars, frag_now);
+}
+
+/* Add the odd bytes of paddings for riscv_handle_align.  */
+
+static void
+riscv_add_odd_padding_symbol (fragS *frag)
+{
+  /* The MAP_DATA may be added redundantly, if the previous mapping
+     symbol is also MAP_DATA.  The riscv_check_mapping_symbols will
+     help to remove it.  */
+  make_mapping_symbol (MAP_DATA, frag->fr_fix, frag);
+  make_mapping_symbol (MAP_INSN, frag->fr_fix + 1, frag);
+}
+
+/* Remove all overlapped and redundant mapping symbols.  */
+
+static void
+riscv_check_mapping_symbols (bfd *abfd ATTRIBUTE_UNUSED,
+			     asection *sec,
+			     void *dummy ATTRIBUTE_UNUSED)
+{
+  segment_info_type *seginfo = seg_info (sec);
+  fragS *fragp;
+  enum riscv_seg_mstate state = MAP_NONE;
+
+  if (seginfo == NULL || seginfo->frchainP == NULL)
+    return;
+
+  for (fragp = seginfo->frchainP->frch_root;
+       fragp != NULL;
+       fragp = fragp->fr_next)
+    {
+      struct riscv_frag_mapping_symbol *last, *temp;
+      temp = fragp->tc_frag_data.map_syms;
+      last = temp;
+      for (; temp != NULL; temp = temp->next)
+	{
+	  /* The mapping state is same as previous one, so remove it.  */
+	  if (state == temp->state)
+	    {
+	      symbol_remove (temp->symbol, &symbol_rootP, &symbol_lastP);
+	      temp->symbol = NULL;
+	      continue;
+	    }
+	  state = temp->state;
+	  last = temp;
+	}
+
+      if (last == NULL || last->symbol == NULL)
+	continue;
+
+      /* Check the last mapping symbol if it is at the boundary of
+	 fragment.  */
+      fragS *next = fragp->fr_next;
+      if (S_GET_VALUE (last->symbol) < next->fr_address)
+	continue;
+      know (S_GET_VALUE (last->symbol) == next->fr_address);
+
+      /* Since we may have empty frags without any mapping symbols,
+	 keep looking until the non-empty frag.  */
+      for (; next != NULL; next = next->fr_next)
+	{
+	  temp = next->tc_frag_data.map_syms;
+	  if (temp != NULL
+	      && temp->symbol != NULL
+	      && (S_GET_VALUE (temp->symbol) == next->fr_address))
+	    {
+	      /* The last mapping symbol overlaps with another one
+		 which is at the start of the next frag.  */
+	      symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);
+	      last->symbol = NULL;
+	      break;
+	    }
+
+	  if (next->fr_next == NULL
+	      && next->fr_fix == 0 && next->fr_var == 0)
+	    {
+	      /* The last mapping symbol is at the end of the section.  */
+	      symbol_remove (last->symbol, &symbol_rootP, &symbol_lastP);
+	      last->symbol = NULL;
+	      break;
+	    }
+
+	  if (next->fr_address != next->fr_next->fr_address)
+	    break;
+	}
+
+      /* This frag won't be checked, so free the unused information.  */
+      temp = fragp->tc_frag_data.map_syms;
+      while (temp != NULL)
+	{
+	  last = temp;
+	  temp = temp->next;
+	  free (last);
+	}
+    }
+}
+
 /* The default target format to use.  */
 
 const char *
@@ -2759,6 +2952,8 @@  md_assemble (char *str)
        return;
     }
 
+  riscv_mapping_state (MAP_INSN, 0);
+
   const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, op_hash);
 
   if (error)
@@ -3424,6 +3619,8 @@  riscv_frag_align_code (int n)
   fix_new_exp (frag_now, nops - frag_now->fr_literal, 0,
 	       &ex, false, BFD_RELOC_RISCV_ALIGN);
 
+  riscv_mapping_state (MAP_INSN, worst_case_bytes);
+
   return true;
 }
 
@@ -3443,6 +3640,7 @@  riscv_handle_align (fragS *fragP)
 	  /* We have 4 byte uncompressed nops.  */
 	  bfd_signed_vma size = 4;
 	  bfd_signed_vma excess = bytes % size;
+	  bfd_boolean odd_padding = (excess % 2 == 1);
 	  char *p = fragP->fr_literal + fragP->fr_fix;
 
 	  if (bytes <= 0)
@@ -3451,12 +3649,20 @@  riscv_handle_align (fragS *fragP)
 	  /* Insert zeros or compressed nops to get 4 byte alignment.  */
 	  if (excess)
 	    {
+	      if (odd_padding)
+		riscv_add_odd_padding_symbol (fragP);
 	      riscv_make_nops (p, excess);
 	      fragP->fr_fix += excess;
 	      p += excess;
 	    }
 
-	  /* Insert variable number of 4 byte uncompressed nops.  */
+	  /* The frag will be changed to `rs_fill` later.  The function
+	     `write_contents` will try to fill the remaining spaces
+	     according to the patterns we give.  In this case, we give
+	     a 4 byte uncompressed nop as the pattern, and set the size
+	     of the pattern into `fr_var`.  The nop will be output to the
+	     file `fr_offset` times.  However, `fr_offset` could be zero
+	     if we don't need to pad the boundary finally.  */
 	  riscv_make_nops (p, size);
 	  fragP->fr_var = size;
 	}
@@ -3467,6 +3673,30 @@  riscv_handle_align (fragS *fragP)
     }
 }
 
+/* This is usually called from frag_var.  */
+
+void
+riscv_init_frag (fragS * fragP, int max_chars)
+{
+  /* Do not add mapping symbol to debug sections.  */
+  if (bfd_section_flags (now_seg) & SEC_DEBUGGING)
+    return;
+
+  switch (fragP->fr_type)
+    {
+    case rs_fill:
+    case rs_align:
+    case rs_align_test:
+      riscv_mapping_state (MAP_DATA, max_chars);
+      break;
+    case rs_align_code:
+      riscv_mapping_state (MAP_INSN, max_chars);
+      break;
+    default:
+      break;
+    }
+}
+
 int
 md_estimate_size_before_relax (fragS *fragp, asection *segtype)
 {
@@ -3723,6 +3953,8 @@  s_riscv_insn (int x ATTRIBUTE_UNUSED)
   save_c = *input_line_pointer;
   *input_line_pointer = '\0';
 
+  riscv_mapping_state (MAP_INSN, 0);
+
   const char *error = riscv_ip (str, &insn, &imm_expr,
 				&imm_reloc, insn_type_hash);
 
@@ -3813,6 +4045,15 @@  riscv_md_end (void)
   riscv_set_public_attributes ();
 }
 
+/* Adjust the symbol table.  */
+
+void
+riscv_adjust_symtab (void)
+{
+  bfd_map_over_sections (stdoutput, riscv_check_mapping_symbols, (char *) 0);
+  elf_adjust_symtab ();
+}
+
 /* Given a symbolic attribute NAME, return the proper integer value.
    Returns -1 if the attribute is not known.  */
 
diff --git a/gas/config/tc-riscv.h b/gas/config/tc-riscv.h
index 1de138458d8..c7dbf215537 100644
--- a/gas/config/tc-riscv.h
+++ b/gas/config/tc-riscv.h
@@ -128,4 +128,34 @@  extern void riscv_elf_final_processing (void);
 extern void riscv_md_end (void);
 extern int riscv_convert_symbolic_attribute (const char *);
 
+/* Set mapping symbol states.  */
+#define md_cons_align(nbytes) riscv_mapping_state (MAP_DATA, 0)
+void riscv_mapping_state (enum riscv_seg_mstate, int);
+
+#define TC_SEGMENT_INFO_TYPE struct riscv_segment_info_type
+struct riscv_segment_info_type
+{
+  enum riscv_seg_mstate map_state;
+};
+
+struct riscv_frag_mapping_symbol
+{
+  enum riscv_seg_mstate state;
+  symbolS *symbol;
+  struct riscv_frag_mapping_symbol *next;
+};
+
+/* Define target fragment type.  */
+#define TC_FRAG_TYPE struct riscv_frag_type
+struct riscv_frag_type
+{
+  struct riscv_frag_mapping_symbol *map_syms;
+};
+
+#define TC_FRAG_INIT(fragp, max_bytes) riscv_init_frag (fragp, max_bytes)
+extern void riscv_init_frag (struct frag *, int);
+
+#define obj_adjust_symtab() riscv_adjust_symtab ()
+extern void riscv_adjust_symtab (void);
+
 #endif /* TC_RISCV */
diff --git a/gas/testsuite/gas/riscv/mapping-01.s b/gas/testsuite/gas/riscv/mapping-01.s
new file mode 100644
index 00000000000..0a270454b1b
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-01.s
@@ -0,0 +1,18 @@ 
+	.option norvc
+	.text
+	.global	funcA
+funcA:
+	nop
+	j	funcB
+
+	.global	funcB
+funcB:
+	nop
+	bne	a0, a1, funcB
+
+	.data
+	.word 0x123456
+
+	.section	.foo, "ax"
+foo:
+	nop
diff --git a/gas/testsuite/gas/riscv/mapping-01a.d b/gas/testsuite/gas/riscv/mapping-01a.d
new file mode 100644
index 00000000000..5cd036afbeb
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-01a.d
@@ -0,0 +1,18 @@ 
+#as:
+#source: mapping-01.s
+#objdump: --syms --special-syms
+
+.*file format.*riscv.*
+
+SYMBOL TABLE:
+0+00 l    d  .text	0+00 .text
+0+00 l    d  .data	0+00 .data
+0+00 l    d  .bss	0+00 .bss
+0+00 l       .text	0+00 \$x
+0+00 l    d  .foo	0+00 .foo
+0+00 l       .foo	0+00 foo
+0+00 l       .foo	0+00 \$x
+# Maybe section symbol for .riscv.attributes
+#...
+0+00 g       .text	0+00 funcA
+0+08 g       .text	0+00 funcB
diff --git a/gas/testsuite/gas/riscv/mapping-01b.d b/gas/testsuite/gas/riscv/mapping-01b.d
new file mode 100644
index 00000000000..723bd5951cb
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-01b.d
@@ -0,0 +1,21 @@ 
+#as:
+#source: mapping-01.s
+#objdump: -d
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <funcA>:
+[ 	]+0:[ 	]+00000013[ 	]+nop
+[ 	]+4:[ 	]+0040006f[ 	]+j[ 	]+8 <funcB>
+
+0+008 <funcB>:
+[ 	]+8:[ 	]+00000013[ 	]+nop
+[ 	]+c:[ 	]+feb51ee3[ 	]+bne[ 	]+a0,a1,8 <funcB>
+
+Disassembly of section .foo:
+
+0+000 <foo>:
+[ 	]+0:[ 	]+00000013[ 	]+nop
diff --git a/gas/testsuite/gas/riscv/mapping-02.s b/gas/testsuite/gas/riscv/mapping-02.s
new file mode 100644
index 00000000000..d3de6dedac8
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-02.s
@@ -0,0 +1,13 @@ 
+	.option norvc
+	.text
+	.global main
+	.type   main, %function
+main:
+	nop
+foo:
+	nop
+	nop
+	.size   main, .-main
+	.ident  ""
+
+	nop
diff --git a/gas/testsuite/gas/riscv/mapping-02a.d b/gas/testsuite/gas/riscv/mapping-02a.d
new file mode 100644
index 00000000000..a60311cb0c1
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-02a.d
@@ -0,0 +1,16 @@ 
+#as:
+#source: mapping-02.s
+#objdump: --syms --special-syms
+
+.*file format.*riscv.*
+
+SYMBOL TABLE:
+0+00 l    d  .text	0+00 .text
+0+00 l    d  .data	0+00 .data
+0+00 l    d  .bss	0+00 .bss
+0+00 l       .text	0+00 \$x
+0+04 l       .text	0+00 foo
+0+00 l    d  .comment	0+00 .comment
+# Maybe section symbol for .riscv.attributes
+#...
+0+00 g     F .text	0+0c main
diff --git a/gas/testsuite/gas/riscv/mapping-02b.d b/gas/testsuite/gas/riscv/mapping-02b.d
new file mode 100644
index 00000000000..3e875600a9c
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-02b.d
@@ -0,0 +1,16 @@ 
+#as:
+#source: mapping-02.s
+#objdump: -d
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <main>:
+[ 	]+0:[ 	]+00000013[ 	]+nop
+
+0+004 <foo>:
+[ 	]+4:[ 	]+00000013[ 	]+nop
+[ 	]+8:[ 	]+00000013[ 	]+nop
+[ 	]+c:[ 	]+00000013[ 	]+nop
diff --git a/gas/testsuite/gas/riscv/mapping-03.s b/gas/testsuite/gas/riscv/mapping-03.s
new file mode 100644
index 00000000000..49964f20288
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-03.s
@@ -0,0 +1,10 @@ 
+	.option norvc
+	.text
+	.word 0
+	nop
+	.data
+	.word 0
+	.text
+	nop
+	.short 0
+	nop
diff --git a/gas/testsuite/gas/riscv/mapping-03a.d b/gas/testsuite/gas/riscv/mapping-03a.d
new file mode 100644
index 00000000000..563db73994a
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-03a.d
@@ -0,0 +1,16 @@ 
+#as:
+#source: mapping-03.s
+#objdump: --syms --special-syms
+
+.*file format.*riscv.*
+
+SYMBOL TABLE:
+0+00 l    d  .text	0+00 .text
+0+00 l    d  .data	0+00 .data
+0+00 l    d  .bss	0+00 .bss
+0+00 l       .text	0+00 \$d
+0+04 l       .text	0+00 \$x
+0+0c l       .text	0+00 \$d
+0+0e l       .text	0+00 \$x
+# Maybe section symbol for .riscv.attributes
+#...
diff --git a/gas/testsuite/gas/riscv/mapping-03b.d b/gas/testsuite/gas/riscv/mapping-03b.d
new file mode 100644
index 00000000000..ed87fcf33d2
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-03b.d
@@ -0,0 +1,16 @@ 
+#as:
+#source: mapping-03.s
+#objdump: -d
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <.text>:
+[ 	]+0:[ 	]+00000000[ 	]+.word[ 	]+0x00000000
+[ 	]+4:[ 	]+00000013[ 	]+nop
+[ 	]+8:[ 	]+00000013[ 	]+nop
+[ 	]+c:[ 	]+0000[ 	]+.short[ 	]+0x0000
+[ 	]+e:[ 	]+00000013[ 	]+nop
+#...
diff --git a/gas/testsuite/gas/riscv/mapping-04.s b/gas/testsuite/gas/riscv/mapping-04.s
new file mode 100644
index 00000000000..36bfac11050
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-04.s
@@ -0,0 +1,11 @@ 
+	.option norvc
+	.text
+	nop
+	.long	0
+	.align	4
+	.word	0x12345678
+	nop
+	.byte	1
+	.long	0
+	.align	4
+	.word	0x12345678
diff --git a/gas/testsuite/gas/riscv/mapping-04a.d b/gas/testsuite/gas/riscv/mapping-04a.d
new file mode 100644
index 00000000000..048518beb9a
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-04a.d
@@ -0,0 +1,21 @@ 
+#as:
+#source: mapping-04.s
+#objdump: --syms --special-syms
+
+.*file format.*riscv.*
+
+SYMBOL TABLE:
+0+00 l    d  .text	0+00 .text
+0+00 l    d  .data	0+00 .data
+0+00 l    d  .bss	0+00 .bss
+0+00 l       .text	0+00 \$x
+0+04 l       .text	0+00 \$d
+0+08 l       .text	0+00 \$x
+0+14 l       .text	0+00 \$d
+0+18 l       .text	0+00 \$x
+0+1c l       .text	0+00 \$d
+0+21 l       .text	0+00 \$x
+0+2d l       .text	0+00 \$d
+0+31 l       .text	0+00 \$x
+# Maybe section symbol for .riscv.attributes
+#...
diff --git a/gas/testsuite/gas/riscv/mapping-04b.d b/gas/testsuite/gas/riscv/mapping-04b.d
new file mode 100644
index 00000000000..64d9f558753
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-04b.d
@@ -0,0 +1,24 @@ 
+#as:
+#source: mapping-04.s
+#objdump: -d
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <.text>:
+[ 	]+0:[ 	]+00000013[ 	]+nop
+[ 	]+4:[ 	]+00000000[ 	]+.word[ 	]+0x00000000
+[ 	]+8:[ 	]+00000013[ 	]+nop
+[ 	]+c:[ 	]+00000013[ 	]+nop
+[ 	]+10:[ 	]+00000013[ 	]+nop
+[ 	]+14:[ 	]+12345678[ 	]+.word[ 	]+0x12345678
+[ 	]+18:[ 	]+00000013[ 	]+nop
+[ 	]+1c:[ 	]+00000001[ 	]+.word[ 	]+0x00000001
+[ 	]+20:[ 	]+00[ 	]+.byte[ 	]+0x00
+[ 	]+21:[ 	]+00000013[ 	]+nop
+[ 	]+25:[ 	]+00000013[ 	]+nop
+[ 	]+29:[ 	]+00000013[ 	]+nop
+[ 	]+2d:[ 	]+12345678[ 	]+.word[ 	]+0x12345678
+#...
diff --git a/gas/testsuite/gas/riscv/mapping-norelax-04a.d b/gas/testsuite/gas/riscv/mapping-norelax-04a.d
new file mode 100644
index 00000000000..0792068890e
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-norelax-04a.d
@@ -0,0 +1,21 @@ 
+#as: -mno-relax
+#source: mapping-04.s
+#objdump: --syms --special-syms
+
+.*file format.*riscv.*
+
+SYMBOL TABLE:
+0+00 l    d  .text	0+00 .text
+0+00 l    d  .data	0+00 .data
+0+00 l    d  .bss	0+00 .bss
+0+00 l       .text	0+00 \$x
+0+04 l       .text	0+00 \$d
+0+08 l       .text	0+00 \$x
+0+10 l       .text	0+00 \$d
+0+14 l       .text	0+00 \$x
+0+18 l       .text	0+00 \$d
+0+20 l       .text	0+00 \$d
+0+24 l       .text	0+00 \$x
+0+1e l       .text	0+00 \$x
+# Maybe section symbol for .riscv.attributes
+#...
diff --git a/gas/testsuite/gas/riscv/mapping-norelax-04b.d b/gas/testsuite/gas/riscv/mapping-norelax-04b.d
new file mode 100644
index 00000000000..71484bbf4c3
--- /dev/null
+++ b/gas/testsuite/gas/riscv/mapping-norelax-04b.d
@@ -0,0 +1,24 @@ 
+#as: -mno-relax
+#source: mapping-04.s
+#objdump: -d
+
+.*:[ 	]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <.text>:
+[ 	]+0:[ 	]+00000013[ 	]+nop
+[ 	]+4:[ 	]+00000000[ 	]+.word[ 	]+0x00000000
+[ 	]+8:[ 	]+00000013[ 	]+nop
+[ 	]+c:[ 	]+00000013[ 	]+nop
+[ 	]+10:[ 	]+12345678[ 	]+.word[ 	]+0x12345678
+[ 	]+14:[ 	]+00000013[ 	]+nop
+[ 	]+18:[ 	]+00000001[ 	]+.word[ 	]+0x00000001
+[ 	]+1c:[ 	]+0000[ 	]+.short[ 	]+0x0000
+[ 	]+1e:[ 	]+0001[ 	]+nop
+[ 	]+20:[ 	]+12345678[ 	]+.word[ 	]+0x12345678
+[ 	]+24:[ 	]+00000013[ 	]+nop
+[ 	]+28:[ 	]+00000013[ 	]+nop
+[ 	]+2c:[ 	]+00000013[ 	]+nop
+#...
diff --git a/gas/testsuite/gas/riscv/no-relax-align-2.d b/gas/testsuite/gas/riscv/no-relax-align-2.d
index 7407b495a8f..0c7280558f0 100644
--- a/gas/testsuite/gas/riscv/no-relax-align-2.d
+++ b/gas/testsuite/gas/riscv/no-relax-align-2.d
@@ -7,7 +7,7 @@ 
 Disassembly of section .text:
 
 0+000 <.text>:
-[ 	]+0:[ 	]+0000[ 	]+unimp
+[ 	]+0:[ 	]+0000[ 	]+.short[ 	]+0x0000
 [ 	]+2:[ 	]+0001[ 	]+nop
 [ 	]+4:[ 	]+00000013[ 	]+nop
 [ 	]+8:[ 	]+00000013[ 	]+nop
diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h
index fdf3df4f5c1..5291ab7e4e7 100644
--- a/include/opcode/riscv.h
+++ b/include/opcode/riscv.h
@@ -425,6 +425,13 @@  enum
   M_NUM_MACROS
 };
 
+/* The mapping symbol states.  */
+enum riscv_seg_mstate
+{
+  MAP_NONE = 0, /* Must be zero, for seginfo in new sections.  */
+  MAP_DATA,
+  MAP_INSN,
+};
 
 extern const char * const riscv_gpr_names_numeric[NGPR];
 extern const char * const riscv_gpr_names_abi[NGPR];
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index fe8dfb88d90..89aa6e59244 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -41,6 +41,10 @@  struct riscv_private_data
   bfd_vma hi_addr[OP_MASK_RD + 1];
 };
 
+static int last_map_symbol = -1;
+static bfd_vma last_stop_offset = 0;
+enum riscv_seg_mstate last_map_state;
+
 static const char * const *riscv_gpr_names;
 static const char * const *riscv_fpr_names;
 
@@ -556,13 +560,189 @@  riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
   return insnlen;
 }
 
+static bool
+riscv_get_map_state (int n,
+		     enum riscv_seg_mstate *state,
+		     struct disassemble_info *info)
+{
+  const char *name;
+
+  /* If the symbol is in a different section, ignore it.  */
+  if (info->section != NULL
+      && info->section != info->symtab[n]->section)
+    return false;
+
+  name = bfd_asymbol_name(info->symtab[n]);
+  if (strcmp (name, "$x") == 0)
+    *state = MAP_INSN;
+  else if (strcmp (name, "$d") == 0)
+    *state = MAP_DATA;
+  else
+    return false;
+
+  return true;
+}
+
+/* Check the sorted symbol table (sorted by the symbol value), find the
+   suitable mapping symbols.  */
+
+static enum riscv_seg_mstate
+riscv_search_mapping_symbol (bfd_vma memaddr,
+			     struct disassemble_info *info)
+{
+  enum riscv_seg_mstate mstate;
+  bool from_last_map_symbol;
+  bool found = false;
+  int symbol = -1;
+  int n;
+
+  /* Decide whether to print the data or instruction by default, in case
+     we can not find the corresponding mapping symbols.  */
+  mstate = MAP_DATA;
+  if ((info->section
+       && info->section->flags & SEC_CODE)
+      || !info->section)
+    mstate = MAP_INSN;
+
+  /* Return default mapping state if there are no suitable symbols.  */
+  if (info->symtab_size == 0
+      || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour)
+    return mstate;
+
+  /* Reset the last_map_symbol if we start to dump a new section.  */
+  if (memaddr <= 0)
+    last_map_symbol = -1;
+
+  /* If the last stop offset is different from the current one, then
+     don't use the last_map_symbol to search.  We usually reset the
+     info->stop_offset when handling a new section.  */
+  from_last_map_symbol = (last_map_symbol >= 0
+			  && info->stop_offset == last_stop_offset);
+
+  /* Start scanning at the info->symtab_pos or the last_map_symbol.
+     Try to find the suitable mapping symbol until the current pc.  */
+  n = info->symtab_pos + 1;
+  if (from_last_map_symbol && n >= last_map_symbol)
+    n = last_map_symbol;
+
+  /* Find the suitable mapping symbol to dump.  */
+  for (; n < info->symtab_size; n++)
+    {
+      bfd_vma addr = bfd_asymbol_value (info->symtab[n]);
+      /* We have searched all possible symbols in the range.  */
+      if (addr > memaddr)
+	break;
+      /* Do not stop searching, in case some mapping symbols
+	 have the same value but different names.  Use the
+	 last one.  */
+      if (riscv_get_map_state (n, &mstate, info))
+	{
+	  symbol = n;
+	  found = true;
+	}
+    }
+
+  /* We can not find the suitable mapping symbol above.  Therefore, we
+     look backwards and try to find it again, but don't go past the start
+     of the section.  Otherwise a data section without mapping symbols
+     can pick up a text mapping symbol of a preceding section.  */
+  if (!found)
+    {
+      n = info->symtab_pos;
+      if (from_last_map_symbol && n >= last_map_symbol)
+      n = last_map_symbol;
+
+      for (; n >= 0; n--)
+	{
+	  bfd_vma addr = bfd_asymbol_value (info->symtab[n]);
+	  /* We have searched all possible symbols in the range.  */
+	  if (addr < (info->section ? info->section->vma : 0))
+	    break;
+	  /* Stop searching once we find the closest mapping symbol.  */
+	  if (riscv_get_map_state (n, &mstate, info))
+	    {
+	      symbol = n;
+	      found = true;
+	      break;
+	    }
+	}
+    }
+
+  /* Save the information for next use.  */
+  last_map_symbol = symbol;
+  last_stop_offset = info->stop_offset;
+
+  return mstate;
+}
+
+/* Decide which data size we should print.  */
+
+static bfd_vma
+riscv_data_length (bfd_vma memaddr, struct disassemble_info *info)
+{
+  bfd_vma size = 4;
+  int n;
+
+  /* Return the default data size if there are no suitable symbols.  */
+  if (info->symtab_size == 0
+      || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour
+      || last_map_symbol < 0)
+    return size;
+
+  for (n = last_map_symbol + 1; n < info->symtab_size; n++)
+    {
+      bfd_vma addr = bfd_asymbol_value (info->symtab[n]);
+      if (addr > memaddr)
+	{
+	  if (addr - memaddr < size)
+	    size = addr - memaddr;
+	  break;
+	}
+    }
+  size = size == 3 ? 2 : size;
+
+  return size;
+}
+
+/* Dump the data contents.  */
+
+static int
+riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
+			insn_t data,
+			disassemble_info *info)
+{
+  info->display_endian = info->endian;
+
+  switch (info->bytes_per_chunk)
+    {
+    case 1:
+      (*info->fprintf_func) (info->stream, ".byte\t0x%02llx",
+			     (unsigned long long) data);
+      break;
+    case 2:
+      (*info->fprintf_func) (info->stream, ".short\t0x%04llx",
+			     (unsigned long long) data);
+      break;
+    case 4:
+      (*info->fprintf_func) (info->stream, ".word\t0x%08llx",
+			     (unsigned long long) data);
+      break;
+    default:
+      abort ();
+    }
+  return info->bytes_per_chunk;
+}
+
 int
 print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
 {
-  bfd_byte packet[2];
+  bfd_byte packet[8];
   insn_t insn = 0;
-  bfd_vma n;
+  bfd_vma dump_size;
   int status;
+  enum riscv_seg_mstate mstate;
+  int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+
 
   if (info->disassembler_options != NULL)
     {
@@ -573,23 +753,42 @@  print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
   else if (riscv_gpr_names == NULL)
     set_default_riscv_dis_options ();
 
-  /* Instructions are a sequence of 2-byte packets in little-endian order.  */
-  for (n = 0; n < sizeof (insn) && n < riscv_insn_length (insn); n += 2)
+  mstate = riscv_search_mapping_symbol (memaddr, info);
+  /* Save the last mapping state.  */
+  last_map_state = mstate;
+
+  /* Set the size to dump.  */
+  if (mstate == MAP_DATA &&
+      (info->flags & DISASSEMBLE_DATA) == 0)
+    {
+      dump_size = riscv_data_length (memaddr, info);
+      info->bytes_per_chunk = dump_size;
+      riscv_disassembler = riscv_disassemble_data;
+    }
+  else
     {
-      status = (*info->read_memory_func) (memaddr + n, packet, 2, info);
+      /* Get the first 2 bytes to check the length of the instruction.  */
+      status = (*info->read_memory_func) (memaddr, packet, 2, info);
       if (status != 0)
 	{
-	  /* Don't fail just because we fell off the end.  */
-	  if (n > 0)
-	    break;
 	  (*info->memory_error_func) (status, memaddr, info);
-	  return status;
+	  return 1;
 	}
+      insn = (insn_t) bfd_getl16 (packet);
+      dump_size = riscv_insn_length (insn);
+      riscv_disassembler = riscv_disassemble_insn;
+    }
 
-      insn |= ((insn_t) bfd_getl16 (packet)) << (8 * n);
+  /* Fetch the instruction to dump.  */
+  status = (*info->read_memory_func) (memaddr, packet, dump_size, info);
+  if (status != 0)
+    {
+      (*info->memory_error_func) (status, memaddr, info);
+      return 1;
     }
+  insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
 
-  return riscv_disassemble_insn (memaddr, insn, info);
+  return (*riscv_disassembler) (memaddr, insn, info);
 }
 
 disassembler_ftype
@@ -631,7 +830,8 @@  riscv_symbol_is_valid (asymbol * sym,
 
   name = bfd_asymbol_name (sym);
 
-  return (strcmp (name, RISCV_FAKE_LABEL_NAME) != 0);
+  return (strcmp (name, RISCV_FAKE_LABEL_NAME) != 0
+	  && !riscv_elf_is_mapping_symbols (name));
 }
 
 void