[5/5] gdb/python: implement the print_insn extension language hook

Message ID a253ed5c6fc90849197223f904bd5a92f5fce2fb.1634162144.git.andrew.burgess@embecosm.com
State New
Series
  • Add Python API for the disassembler

Commit Message

Andrew Burgess Oct. 13, 2021, 9:59 p.m.
This commit extends the Python API to include disassembler support,
and additionally provides a syntax highlighting disassembler.

The motivation for this commit was to provide an API by which the user
could write Python scripts that would augment the output of the
disassembler.

To achieve this I have followed the model of the existing libopcodes
disassembler, that is, instructions are disassembled one by one.  This
does restrict the type of things that it is possible to do from a
Python script, i.e. all additional output has to fit on a single line,
but this was all I needed, and creating something more complex would,
I think, require greater changes to how GDB's internal disassembler
operates.

It was only once I had a working prototype that I realised I could
very easily use this to perform syntax highlighting on GDB's
disassembly output, so I've included that too.  The new commands added
are:

  set style disassembly on|off
  show style disassembly

which enable or disable disassembly syntax highlighting.

The disassembler API is contained in the new gdb.disassembler module,
which defines the following classes:

  DisassembleInfo

      Similar to the libopcodes disassemble_info structure.  It has
  read-only attributes: address, string, length, architecture, and
  can_emit_style_escape, and methods: read_memory, set_result, and
  memory_error.

      Each time GDB wants an instruction disassembled, an instance of
  this class is passed to a user written disassembler.  By reading the
  attributes and calling the methods, the user can perform
  disassembly and set the result within the DisassembleInfo instance.

  Disassembler

      This is a base-class which user written disassemblers should
  inherit from.  It just provides base implementations of __init__ and
  __call__ which the user written disassembler should override.

The gdb.disassembler module also provides the following functions:

  register_disassembler

      This function registers an instance of a Disassembler sub-class
  as a disassembler, either for one specific architecture, or, as a
  global disassembler for all architectures.

  format_address

      This wraps GDB's print_address function, converting an address
  into a string that can be placed into disassembler output.

  syntax_highlight

      This adds syntax highlighting escapes to some disassembler
  output.  Users can call this from their own custom disassemblers to
  retain syntax highlighting.  This function handles the cases where
  syntax highlighting is switched off, or where the pygments library
  is not available.

  builtin_disassemble

      This provides access to GDB's builtin disassembler.  A common
  use case that I see is augmenting the existing disassembler
  output.  The user code can call this function to have GDB
  disassemble the instruction in the normal way, and then the user can
  tweak the output before returning that as the result.  This function
  also provides a mechanism to intercept the disassembler's reads of
  memory, thus the user can adjust what GDB sees when it is
  disassembling.

The included documentation provides a more detailed description of the
API.
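
As a rough illustration of the "augment the builtin output" use case
described above, here is a minimal sketch.  The gdb module can only be
imported inside GDB, so FakeDisassembleInfo and
fake_builtin_disassemble are hypothetical stand-ins that mimic only
the attributes and methods named in this message; the argument order
of set_result is an assumption, and the real signatures are given in
the documentation part of the patch.

```python
# Hypothetical stand-ins for gdb.disassembler objects, so that this
# sketch runs outside GDB.  None of these names are GDB's real API.
class FakeDisassembleInfo:
    """Mimics the DisassembleInfo attributes named in the commit message."""

    def __init__(self, address):
        self.address = address
        self.length = None   # No result has been set yet.
        self.string = None

    def set_result(self, length, string):
        # Assumed argument order: instruction length, then text.
        self.length = length
        self.string = string


def fake_builtin_disassemble(info):
    """Pretend builtin disassembler: always reports a 1-byte nop."""
    info.set_result(1, "nop")


class CommentingDisassembler:
    """Inside GDB this would inherit from gdb.disassembler.Disassembler."""

    def __init__(self, name):
        self.name = name

    def __call__(self, info):
        # Let the builtin disassembler produce the normal output first.
        fake_builtin_disassemble(info)
        # Then append extra text; it has to stay on the same line.
        annotated = "%s\t# at 0x%x" % (info.string, info.address)
        info.set_result(info.length, annotated)


info = FakeDisassembleInfo(0x1042)
CommentingDisassembler("commenter")(info)
print(info.string)
```

Inside GDB, an instance of such a class would be handed to
register_disassembler, either for one architecture or for all of them.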
---
 gdb/Makefile.in                        |   1 +
 gdb/NEWS                               |  42 ++
 gdb/data-directory/Makefile.in         |   1 +
 gdb/disasm.c                           |   5 +-
 gdb/disasm.h                           |  13 +-
 gdb/doc/gdb.texinfo                    |  14 +
 gdb/doc/python.texi                    | 252 +++++++
 gdb/python/lib/gdb/disassembler.py     | 194 ++++++
 gdb/python/py-arch.c                   |   9 +
 gdb/python/py-disasm.c                 | 905 +++++++++++++++++++++++++
 gdb/python/python-internal.h           |  21 +
 gdb/python/python.c                    |  11 +-
 gdb/testsuite/gdb.base/style.exp       |  45 +-
 gdb/testsuite/gdb.python/py-disasm.c   |  25 +
 gdb/testsuite/gdb.python/py-disasm.exp | 201 ++++++
 gdb/testsuite/gdb.python/py-disasm.py  | 538 +++++++++++++++
 16 files changed, 2267 insertions(+), 10 deletions(-)
 create mode 100644 gdb/python/lib/gdb/disassembler.py
 create mode 100644 gdb/python/py-disasm.c
 create mode 100644 gdb/testsuite/gdb.python/py-disasm.c
 create mode 100644 gdb/testsuite/gdb.python/py-disasm.exp
 create mode 100644 gdb/testsuite/gdb.python/py-disasm.py

-- 
2.25.4

Comments

Kuan-Ying Lee via Gdb-patches Oct. 14, 2021, 7:12 a.m. | #1
> From: Andrew Burgess <andrew.burgess@embecosm.com>
> Date: Wed, 13 Oct 2021 22:59:10 +0100
> 
> diff --git a/gdb/NEWS b/gdb/NEWS
> index d001a03145d..fd1952a2f59 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -32,6 +32,12 @@ maint show internal-warning backtrace
>    internal-error, or an internal-warning.  This is on by default for
>    internal-error and off by default for internal-warning.
>  
> +set style disassembly on|off
> +show style disassembly
> +  If GDB is compiled with Python support, and the Python pygments
> +  module is available, then, when this setting is on, disassembler
> +  output will have styling applied.


If this requires Python with a module that is not available by
default, I think a general style name like "disassembly" would be
misleading.  I suggest "pygment-disassembly" instead.

> +@item set style disassembly @samp{on|off}
> +Enable or disable disassembly styling.  This affects whether
> +disassembly output, such as the output of the @code{disassemble}
> +command, is styled.  The default is @samp{on}.  Note that disassembly
> +styling only works if styling in general is enabled, and if a source
> +highlighting library is available to @value{GDBN}.
> +
> +To highlight disassembly output @value{GDBN} must be compiled with
> +Python support, and the Python Pygments package must be available,


So what does the default ON setting mean if pygments module is not
available, or if GDB was not compiled with Python support?

> +@node Disassembly In Python
> +@cindex Python Instruction Disassembly


Index entries should begin with a lower-case letter, so that sorting
of the entries in the produced manual would not depend on the locale.

> +@defivar DisassembleInfo can_emit_style_escapes
> +This is @code{True} if the output stream that the disassembler is
> +currently printing too can support escape sequences use for colors,
                      ^^^
Should be "to".

> +otherwise this attribute is @code{False}.


Not sure why you are talking about escape sequences: we support
styling with colors also on terminals without escape sequences.  Does
this mean this feature _must_ have actual escape sequence support?

> +@defmethod DisassembleInfo memory_error (offset)
> +This method marks the @code{DisassembleInfo} as having experienced a
> +@code{gdb.MemoryError} when trying to access memory of @var{offset}
> +bytes from @code{DisassembleInfo.address}.


Should this text have a cross-reference to where MemoryError is
described?

> +The optional @var{architecture} is either a string, or the value
> +@code{None}.  If it is a string, then it should be the name of an
> +architecture known to @value{GDBN}, as returned either from
> +@code{gdb.Architecture.name()}
> +(@pxref{gdbpy_architecture_name,,gdb.Architecture.name}), or from
> +@code{gdb.architecture_names()}
> +(@pxref{gdb_architecture_names,,gdb.architecture_names}).


Please remove the parentheses from the references to these methods.

> +@defun format_address (architecture, address)
> +Returns @var{address} formatted as a string, in a style suitable for
> +including in the disassembly output of an instruction, for example a
> +formatted address might look like:
> +
> +@smallexample
> +0x00001042 <symbol+16>
> +@end smallexample
> +
> +@var{architecture} is a @code{gdb.Architecture} (@pxref{Architectures
> +In Python}), which is required to format the addresses correctly.
> +This can be obtained from @code{DisassembleInfo.architecture}.


This last paragraph should have @noindent before it, since it's a
continuation of the description of format_address.

> +After calling this function the result in @var{info} @emph{might} have
> +been updated to include syntax highlighting escape sequences.  If
> +syntax highlighting is disabled in @value{GDBN}, or the output stream
> +doesn't support syntax highlighting, then this function will leave
> +@var{info} unchanged.


I suggest a cross-reference to commands that enable syntax
highlighting where you mention it.

> +This function should return a Python object that supports the buffer
> +protocol, i.e. a string, an array, or the object returned from


Please add @: after i.e., to prevent TeX from typesetting that as an
end of a sentence.

Thanks.
Kuan-Ying Lee via Gdb-patches Oct. 22, 2021, 1:30 p.m. | #2
Hi Andrew,

I don't have time to read all the code, so I'll just nit-pick on the
public API.

On 2021-10-13 17:59, Andrew Burgess wrote:
> This commit extends the Python API to include disassembler support,
> and additionally provides a syntax highlighting disassembler.
> 
> The motivation for this commit was to provide an API by which the user
> could write Python scripts that would augment the output of the
> disassembler.
> 
> To achieve this I have followed the model of the existing libopcodes
> disassembler, that is, instructions are disassembled one by one.  This
> does restrict the type of things that it is possible to do from a
> Python script, i.e. all additional output has to fit on a single line,
> but this was all I needed, and creating something more complex would,
> I think, require greater changes to how GDB's internal disassembler
> operates.
> 
> It was only once I had a working prototype that I realised I could
> very easily use this to perform syntax highlighting on GDB's
> disassembly output, so I've included that too.  The new commands added
> are:
> 
>   set style disassembly on|off
>   show style disassembly
> 
> which enable or disable disassembly syntax highlighting.
> 
> The disassembler API is contained in the new gdb.disassembler module,
> which defines the following classes:
> 
>   DisassembleInfo
> 
>       Similar to libopcodes disassemble_info structure, has read-only
>   attributes: address, string, length, architecture, and
>   can_emit_style_escape.  And has methods: read_memory, set_result,
>   and memory_error.
> 
>       Each time GDB wants an instruction disassembled, an instance of
>   this class is passed to a user written disassembler, by reading the
>   attributes, and calling the methods, the user can perform
>   disassembly, and set the result within the DisassembleInfo instance.
> 
>   Disassembler
> 
>       This is a base-class which user written disassemblers should
>   inherit from, just provides base implementations of __init__ and
>   __call__ which the user written disassembler should override.
> 
> The gdb.disassembler module also provides the following functions:
> 
>   register_disassembler
> 
>       This function registers an instance of a Disassembler sub-class
>   as a disassembler, either for one specific architecture, or, as a
>   global disassembler for all architectures.
> 
>   format_address
> 
>       This wraps GDB's print_address function, converting an address
>   into a string that can be placed into disassembler output.
> 
>   syntax_highlight
> 
>       This adds syntax highlighting escapes to some disassembler
>   output, users can call this from their own custom disassemblers to
>   retain syntax highlighting, this function handles switching syntax
>   highlighting off, or the case where the pygments library is not
>   available.
> 
>   builtin_disassemble
> 
>       This provides access to GDB's builtin disassembler.  A common
>   user case that I see is augmenting the existing disassembler
>   output.  The user code can call this function to have GDB
>   disassemble the instruction in the normal way, and then the user can
>   tweak the output before returning that as the result.  This function
>   also provides a mechanism to intercept the disassemblers reads of
>   memory, thus the user can adjust what GDB sees when it is
>   disassembling.
> 
> The included documentation provides a more detailed description of the
> API.
> ---
>  gdb/Makefile.in                        |   1 +
>  gdb/NEWS                               |  42 ++
>  gdb/data-directory/Makefile.in         |   1 +
>  gdb/disasm.c                           |   5 +-
>  gdb/disasm.h                           |  13 +-
>  gdb/doc/gdb.texinfo                    |  14 +
>  gdb/doc/python.texi                    | 252 +++++++
>  gdb/python/lib/gdb/disassembler.py     | 194 ++++++
>  gdb/python/py-arch.c                   |   9 +
>  gdb/python/py-disasm.c                 | 905 +++++++++++++++++++++++++
>  gdb/python/python-internal.h           |  21 +
>  gdb/python/python.c                    |  11 +-
>  gdb/testsuite/gdb.base/style.exp       |  45 +-
>  gdb/testsuite/gdb.python/py-disasm.c   |  25 +
>  gdb/testsuite/gdb.python/py-disasm.exp | 201 ++++++
>  gdb/testsuite/gdb.python/py-disasm.py  | 538 +++++++++++++++
>  16 files changed, 2267 insertions(+), 10 deletions(-)
>  create mode 100644 gdb/python/lib/gdb/disassembler.py
>  create mode 100644 gdb/python/py-disasm.c
>  create mode 100644 gdb/testsuite/gdb.python/py-disasm.c
>  create mode 100644 gdb/testsuite/gdb.python/py-disasm.exp
>  create mode 100644 gdb/testsuite/gdb.python/py-disasm.py
> 
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index ec5d332c145..3981cc9507c 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -392,6 +392,7 @@ SUBDIR_PYTHON_SRCS = \
>  	python/py-breakpoint.c \
>  	python/py-cmd.c \
>  	python/py-continueevent.c \
> +	python/py-disasm.c \
>  	python/py-event.c \
>  	python/py-evtregistry.c \
>  	python/py-evts.c \
> diff --git a/gdb/NEWS b/gdb/NEWS
> index d001a03145d..fd1952a2f59 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -32,6 +32,12 @@ maint show internal-warning backtrace
>    internal-error, or an internal-warning.  This is on by default for
>    internal-error and off by default for internal-warning.
>  
> +set style disassembly on|off
> +show style disassembly
> +  If GDB is compiled with Python support, and the Python pygments
> +  module is available, then, when this setting is on, disassembler
> +  output will have styling applied.
> +
>  * Python API
>  
>    ** New function gdb.add_history(), which takes a gdb.Value object
> @@ -49,6 +55,42 @@ maint show internal-warning backtrace
>       containing all of the possible Architecture.name() values.  Each
>       entry is a string.
>  
> +  ** New Python API for wrapping GDB's disassembler:
> +
> +     - gdb.disassembler.register_disassembler(DISASSEMBLER, ARCH).
> +       DISASSEMBLER is a sub-class of gdb.disassembler.Disassembler.
> +       ARCH is either None or a string containing a bfd architecture
> +       name.  DISASSEMBLER is registered as a disassembler for
> +       architecture ARCH, or for all architectures if ARCH is None.
> +       The previous disassembler registered for ARCH is returned, this
> +       can be None if no previous disassembler was registered.
> +
> +     - gdb.disassembler.Disassembler is the class from which all
> +       disassemblers should inherit.  Its constructor takes a string,
> +       a name for the disassembler, which is currently only used is
> +       some debug output.  Sub-classes should override the __call__
> +       method to perform disassembly, invoking __call__ on this base
> +       class will raise an exception.
> +
> +     - gdb.disassembler.DisassembleInfo is the class used to describe
> +       a single disassembly request from GDB.  An instace of this


instace -> instance

> +       class is passed to the __call__ method of
> +       gdb.disassembler.Disassembler and has the following read-only
> +       attributes: 'address', 'string', 'length', 'architecture',
> +       'can_emit_style_escape', and the following methods
> +       'read_memory', 'set_result', and 'memory error'.


Just wondering, why have a "set_result" method instead of just having
the __call__ method return something?

You probably mean 'memory_error' instead of 'memory error'.  But can you
explain when you expect users to manually call "memory_error"?  I would
expect that calling read_memory may raise a gdb.MemoryError, but when
would the user manually generate a memory error?

And regardless of the above, I think it would be more Pythonic to have
the user raise an exception to signal an error, instead of calling a
method.  I'm not sure I understand the use case of calling set_result
and / or memory_error more than once, and have one overwrite the other.
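
A minimal sketch of the return-or-raise style suggested here, for
comparison.  Every name in it is made up for illustration and none of
it is part of the patch: FakeMemoryError stands in for
gdb.MemoryError, and memory is modelled as a plain dict from address
to byte value.

```python
class FakeMemoryError(Exception):
    """Hypothetical stand-in for gdb.MemoryError."""


def read_byte(memory, address):
    """Return the byte at ADDRESS, raising if the address is unmapped."""
    if address not in memory:
        raise FakeMemoryError("cannot access memory at 0x%x" % address)
    return memory[address]


def disassemble_one(memory, address):
    """Return a (length, text) pair for the instruction at ADDRESS.

    Errors are signalled by letting the exception propagate, rather
    than by calling a method on an info object.
    """
    opcode = read_byte(memory, address)  # May raise FakeMemoryError.
    if opcode == 0x90:
        return 1, "nop"
    # Unknown opcode: fall back to a raw byte directive.
    return 1, ".byte 0x%02x" % opcode


memory = {0x1000: 0x90, 0x1001: 0xCC}
print(disassemble_one(memory, 0x1000))
```

With this shape there is exactly one result per call, so the question
of one set_result or memory_error call overwriting another does not
arise.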

> +
> +     - gdb.disassembler.format_address(ARCHITECTURE, ADDRESS), formats
> +       an address into a string so that the string can be included in
> +       the disassembler output.  ARCHITECTURE is a gdb.Architecture
> +       object.


Would it make sense to have that as a
"gdb.Architecture.format_address(ADDRESS)" method instead?  I'm thinking
that you might want to use this in other contexts than disassembly.  You
could always use gdb.disassembler.format_address anyway, but it would be
weird to use the gdb.disassembler module for something not
disassembly-related.

Simon
Andrew Burgess Oct. 22, 2021, 5:47 p.m. | #3
Thanks for the feedback.  I have a couple of questions:

* Eli Zaretskii <eliz@gnu.org> [2021-10-14 10:12:45 +0300]:

> > From: Andrew Burgess <andrew.burgess@embecosm.com>
> > Date: Wed, 13 Oct 2021 22:59:10 +0100
> > 
> > diff --git a/gdb/NEWS b/gdb/NEWS
> > index d001a03145d..fd1952a2f59 100644
> > --- a/gdb/NEWS
> > +++ b/gdb/NEWS
> > @@ -32,6 +32,12 @@ maint show internal-warning backtrace
> >    internal-error, or an internal-warning.  This is on by default for
> >    internal-error and off by default for internal-warning.
> >  
> > +set style disassembly on|off
> > +show style disassembly
> > +  If GDB is compiled with Python support, and the Python pygments
> > +  module is available, then, when this setting is on, disassembler
> > +  output will have styling applied.
> 
> If this requires Python with a module that is not available by
> default, I think a general style name like "disassembly" would be
> misleading.  I suggest "pygment-disassembly" instead.


I'm not sure I agree with this.  I don't see why we'd want to leak an
implementation detail (that we're using the pygment library) in a
setting name.

Surely, the setting should reflect what the effect is within GDB, and,
as far as possible, the implementation should be hidden from the
user. I'm hoping that some of the clarifications below might make this
more palatable...

> 
> > +@item set style disassembly @samp{on|off}
> > +Enable or disable disassembly styling.  This affects whether
> > +disassembly output, such as the output of the @code{disassemble}
> > +command, is styled.  The default is @samp{on}.  Note that disassembly
> > +styling only works if styling in general is enabled, and if a source
> > +highlighting library is available to @value{GDBN}.
> > +
> > +To highlight disassembly output @value{GDBN} must be compiled with
> > +Python support, and the Python Pygments package must be available,
> 
> So what does the default ON setting mean if pygments module is not
> available, or if GDB was not compiled with Python support?


You're correct, I've reworded this to reflect what actually happens.

First, as this is all implemented in Python, if GDB is compiled
without Python support then this setting (and the underlying feature)
is just not available.

If we do have Python, but not the Pygments library, then this feature
will be off by default, and an attempt to turn it on will give an
error that informs the user that the Python Pygments library is
missing.

Finally, if all the bits are in place, then this feature is on by
default.
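
The three cases above can be sketched roughly as follows.  This is
only an illustration of the intended behaviour, not the code in the
patch: both function names are invented here, the have_python flag
stands in for whether GDB was built with Python support, and the
error text is a placeholder.

```python
import importlib.util


def default_disassembly_styling(have_python, pygments_available=None):
    """Return the default for 'set style disassembly'.

    Returns None when the setting does not exist at all (no Python
    support), otherwise True or False.
    """
    if not have_python:
        return None
    if pygments_available is None:
        # Probe for the Pygments package without importing it.
        pygments_available = importlib.util.find_spec("pygments") is not None
    return pygments_available


def set_disassembly_styling(value, pygments_available):
    """Validate a request to change the setting; return the new value."""
    if value and not pygments_available:
        # Turning the feature on without Pygments is an error.
        raise RuntimeError("the Python Pygments library is missing")
    return value


print(default_disassembly_styling(have_python=False))
```

Turning the setting off is always accepted; only an attempt to turn it
on without Pygments is rejected.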

> > +@node Disassembly In Python
> > +@cindex Python Instruction Disassembly
> 
> Index entries should begin with a lower-case letter, so that sorting
> of the entries in the produced manual would not depend on the locale.
> 
> > +@defivar DisassembleInfo can_emit_style_escapes
> > +This is @code{True} if the output stream that the disassembler is
> > +currently printing too can support escape sequences use for colors,
>                       ^^^
> Should be "to".
> 
> > +otherwise this attribute is @code{False}.
> 
> Not sure why you are talking about escape sequences: we support
> styling with colors also on terminals without escape sequences.  Does
> this mean this feature _must_ have actual escape sequence support?


I took the name from the internal GDB functions that do the same
check.  Can you point me at the terminal that does syntax highlighting
without using escape sequences?  Then I can see how this hooks back
into GDB.

Maybe I should rename this function 'supports_styling'? or something
similar?

Everything else I've fixed in my local tree.

Thanks,
Andrew


> 
> > +@defmethod DisassembleInfo memory_error (offset)
> > +This method marks the @code{DisassembleInfo} as having experienced a
> > +@code{gdb.MemoryError} when trying to access memory of @var{offset}
> > +bytes from @code{DisassembleInfo.address}.
> 
> Should this text have a cross-reference to where MemoryError is
> described?
> 
> > +The optional @var{architecture} is either a string, or the value
> > +@code{None}.  If it is a string, then it should be the name of an
> > +architecture known to @value{GDBN}, as returned either from
> > +@code{gdb.Architecture.name()}
> > +(@pxref{gdbpy_architecture_name,,gdb.Architecture.name}), or from
> > +@code{gdb.architecture_names()}
> > +(@pxref{gdb_architecture_names,,gdb.architecture_names}).
> 
> Please remove the parentheses from the references to these methods.
> 
> > +@defun format_address (architecture, address)
> > +Returns @var{address} formatted as a string, in a style suitable for
> > +including in the disassembly output of an instruction, for example a
> > +formatted address might look like:
> > +
> > +@smallexample
> > +0x00001042 <symbol+16>
> > +@end smallexample
> > +
> > +@var{architecture} is a @code{gdb.Architecture} (@pxref{Architectures
> > +In Python}), which is required to format the addresses correctly.
> > +This can be obtained from @code{DisassembleInfo.architecture}.
> 
> This last paragraph should have @noindent before it, since it's a
> continuation the description of format_address.
> 
> > +After calling this function the result in @var{info} @emph{might} have
> > +been updated to include syntax highlighting escape sequences.  If
> > +syntax highlighting is disabled in @value{GDBN}, or the output stream
> > +doesn't support syntax highlighting, then this function will leave
> > +@var{info} unchanged.
> 
> I suggest a cross-reference to commands that enable syntax
> highlighting where you mention it.
> 
> > +This function should return a Python object that supports the buffer
> > +protocol, i.e. a string, an array, or the object returned from
> 
> Please add @: after i.e., to prevent TeX from typesetting that as an
> end of a sentence.
> 
> Thanks.
Kuan-Ying Lee via Gdb-patches Oct. 22, 2021, 6:33 p.m. | #4
> Date: Fri, 22 Oct 2021 18:47:09 +0100
> From: Andrew Burgess <andrew.burgess@embecosm.com>
> Cc: gdb-patches@sourceware.org
> 
> > > +set style disassembly on|off
> > > +show style disassembly
> > > +  If GDB is compiled with Python support, and the Python pygments
> > > +  module is available, then, when this setting is on, disassembler
> > > +  output will have styling applied.
> > 
> > If this requires Python with a module that is not available by
> > default, I think a general style name like "disassembly" would be
> > misleading.  I suggest "pygment-disassembly" instead.
> 
> I'm not sure I agree with this.  I don't see why we'd want to leak an
> implementation detail (that we're using the pygment library) in a
> setting name.
> 
> Surely, the setting should reflect what the effect is within GDB, and,
> as far as possible, the implementation should be hidden from the
> user. I'm hoping that some of the clarifications below might make this
> more palatable...


It's just confusing for a command that doesn't work without Python to
have a name that doesn't somehow hint at Python being required,
especially since we have a lot of "set style" commands that don't
require Python.

> > Not sure why you are talking about escape sequences: we support
> > styling with colors also on terminals without escape sequences.  Does
> > this mean this feature _must_ have actual escape sequence support?
> 
> I took the name from the internal GDB functions that do the same
> check.  Can you point me at the terminal that does syntax highlighting
> without using escape sequences, then I can see how this hooks back
> into GDB.


The MS-Windows console is the one example I know about.

> Maybe I should rename this function 'supports_styling'? or something
> similar?


Yes, I think that'd be better.

Thanks.

Patch

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index ec5d332c145..3981cc9507c 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -392,6 +392,7 @@  SUBDIR_PYTHON_SRCS = \
 	python/py-breakpoint.c \
 	python/py-cmd.c \
 	python/py-continueevent.c \
+	python/py-disasm.c \
 	python/py-event.c \
 	python/py-evtregistry.c \
 	python/py-evts.c \
diff --git a/gdb/NEWS b/gdb/NEWS
index d001a03145d..fd1952a2f59 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -32,6 +32,12 @@  maint show internal-warning backtrace
   internal-error, or an internal-warning.  This is on by default for
   internal-error and off by default for internal-warning.
 
+set style disassembly on|off
+show style disassembly
+  If GDB is compiled with Python support, and the Python pygments
+  module is available, then, when this setting is on, disassembler
+  output will have styling applied.
+
 * Python API
 
   ** New function gdb.add_history(), which takes a gdb.Value object
@@ -49,6 +55,42 @@  maint show internal-warning backtrace
      containing all of the possible Architecture.name() values.  Each
      entry is a string.
 
+  ** New Python API for wrapping GDB's disassembler:
+
+     - gdb.disassembler.register_disassembler(DISASSEMBLER, ARCH).
+       DISASSEMBLER is a sub-class of gdb.disassembler.Disassembler.
+       ARCH is either None or a string containing a bfd architecture
+       name.  DISASSEMBLER is registered as a disassembler for
+       architecture ARCH, or for all architectures if ARCH is None.
+       The previous disassembler registered for ARCH is returned, this
+       can be None if no previous disassembler was registered.
+
+     - gdb.disassembler.Disassembler is the class from which all
+       disassemblers should inherit.  Its constructor takes a string,
+       a name for the disassembler, which is currently only used is
+       some debug output.  Sub-classes should override the __call__
+       method to perform disassembly, invoking __call__ on this base
+       class will raise an exception.
+
+     - gdb.disassembler.DisassembleInfo is the class used to describe
+       a single disassembly request from GDB.  An instace of this
+       class is passed to the __call__ method of
+       gdb.disassembler.Disassembler and has the following read-only
+       attributes: 'address', 'string', 'length', 'architecture',
+       'can_emit_style_escape', and the following methods
+       'read_memory', 'set_result', and 'memory error'.
+
+     - gdb.disassembler.format_address(ARCHITECTURE, ADDRESS), formats
+       an address into a string so that the string can be included in
+       the disassembler output.  ARCHITECTURE is a gdb.Architecture
+       object.
+
+     - gdb.disassembler.builtin_disassemble(INFO, MEMORY_SOURCE),
+       calls GDB's builtin disassembler on INFO, which is a
+       gdb.disassembler.DisassembleInfo object.  MEMORY_SOURCE is
+       optional, its default value is None.  If MEMORY_SOURCE is not
+       None then it must be an object that has a 'read_memory' method.
+
 *** Changes in GDB 11
 
 * The 'set disassembler-options' command now supports specifying options
diff --git a/gdb/data-directory/Makefile.in b/gdb/data-directory/Makefile.in
index 888325f974e..775516a53cc 100644
--- a/gdb/data-directory/Makefile.in
+++ b/gdb/data-directory/Makefile.in
@@ -69,6 +69,7 @@  PYTHON_DIR = python
 PYTHON_INSTALL_DIR = $(DESTDIR)$(GDB_DATADIR)/$(PYTHON_DIR)
 PYTHON_FILE_LIST = \
 	gdb/__init__.py \
+	gdb/disassembler.py \
 	gdb/FrameDecorator.py \
 	gdb/FrameIterator.py \
 	gdb/frames.py \
diff --git a/gdb/disasm.c b/gdb/disasm.c
index 0c384c778f5..3a0a11ec3bb 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -752,12 +752,13 @@  get_all_disassembler_options (struct gdbarch *gdbarch)
 
 gdb_disassembler::gdb_disassembler (struct gdbarch *gdbarch,
 				    struct ui_file *file,
-				    di_read_memory_ftype read_memory_func)
+				    di_read_memory_ftype read_memory_func,
+				    di_memory_error_ftype memory_error_func)
   : m_gdbarch (gdbarch)
 {
   init_disassemble_info (&m_di, file, dis_asm_fprintf);
   m_di.flavour = bfd_target_unknown_flavour;
-  m_di.memory_error_func = dis_asm_memory_error;
+  m_di.memory_error_func = memory_error_func;
   m_di.print_address_func = dis_asm_print_address;
   /* NOTE: cagney/2003-04-28: The original code, from the old Insight
      disassembler had a local optimization here.  By default it would
diff --git a/gdb/disasm.h b/gdb/disasm.h
index f6de33e3db8..eca116c98f8 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -41,6 +41,7 @@  struct ui_file;
 class gdb_disassembler
 {
   using di_read_memory_ftype = decltype (disassemble_info::read_memory_func);
+  using di_memory_error_ftype = decltype (disassemble_info::memory_error_func);
 
 public:
   gdb_disassembler (struct gdbarch *gdbarch, struct ui_file *file)
@@ -59,11 +60,21 @@  class gdb_disassembler
 
 protected:
   gdb_disassembler (struct gdbarch *gdbarch, struct ui_file *file,
-		    di_read_memory_ftype func);
+		    di_read_memory_ftype read_memory_func)
+    : gdb_disassembler (gdbarch, file, read_memory_func,
+			dis_asm_memory_error)
+  { /* Nothing.  */ }
+
+  gdb_disassembler (struct gdbarch *gdbarch, struct ui_file *file,
+		    di_read_memory_ftype read_memory_func,
+		    di_memory_error_ftype memory_error_func);
 
   struct ui_file *stream ()
   { return (struct ui_file *) m_di.stream; }
 
+  struct disassemble_info *disasm_info ()
+  { return &m_di; }
+
 private:
   struct gdbarch *m_gdbarch;
 
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 631a7c03b31..9af415cc018 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -26071,6 +26071,20 @@ 
 
 @item show style sources
 Show the current state of source code styling.
+
+@item set style disassembly @samp{on|off}
+Enable or disable disassembly styling.  This affects whether
+disassembly output, such as the output of the @code{disassemble}
+command, is styled.  The default is @samp{on}.  Note that disassembly
+styling only works if styling in general is enabled, and if a source
+highlighting library is available to @value{GDBN}.
+
+To highlight disassembly output @value{GDBN} must be compiled with
+Python support, and the Python Pygments package must be available,
+
+@item show style disassembly
+Show the current state of disassembly styling.
+
 @end table
 
 Subcommands of @code{set style} control specific forms of styling.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 04192f906c8..808934aea73 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -221,6 +221,7 @@ 
 * Architectures In Python::     Python representation of architectures.
 * Registers In Python::         Python representation of registers.
 * TUI Windows In Python::       Implementing new TUI windows.
+* Disassembly In Python::       Instruction disassembly in Python.
 @end menu
 
 @node Basic Python
@@ -557,6 +558,7 @@ 
 related prompts are prohibited from being changed.
 @end defun
 
+@anchor{gdb_architecture_names}
 @defun gdb.architecture_names ()
 Return a list containing all of the architecture names that the
 current build of @value{GDBN} supports.  Each architecture name is a
@@ -3136,6 +3138,7 @@ 
 particular frame (@pxref{Frames In Python}).
 @end defun
 
+@anchor{gdbpy_inferior_read_memory}
 @findex Inferior.read_memory
 @defun Inferior.read_memory (address, length)
 Read @var{length} addressable memory units from the inferior, starting at
@@ -6075,6 +6078,255 @@ 
 2 (middle), or 3 (right).
 @end defun
 
+@node Disassembly In Python
+@cindex Python Instruction Disassembly
+@subsubsection Instruction Disassembly In Python
+
+@value{GDBN}'s builtin disassembler can be extended, or even replaced,
+using the Python API.  The disassembler related features are contained
+within the @code{gdb.disassembler} module:
+
+@deftp {class} DisassembleInfo
+Disassembly is driven by instances of this class.  Each time
+@value{GDBN} needs to disassemble an instruction, an instance of this
+class is created and passed to a registered disassembler.  The
+disassembler is then responsible for disassembling an instruction and
+storing the result within the instance of this class.  The following
+attributes and methods are available:
+
+@defivar DisassembleInfo address
+An integer containing the address at which @value{GDBN} wishes to
+disassemble a single instruction.
+@end defivar
+
+@defivar DisassembleInfo string
+A string that is the result of the disassembly.  If no result has yet
+been set then this field contains @code{None}.
+@end defivar
+
+@defivar DisassembleInfo length
+An integer that is the length of the disassembled instruction in
+bytes, or @code{None} if no result has yet been set for this
+instruction.
+
+When a result has been set then the length will always be a non-zero
+positive integer.
+@end defivar
+
+@defivar DisassembleInfo architecture
+The @code{gdb.Architecture} (@pxref{Architectures In Python}) for
+which @value{GDBN} is currently disassembling.
+@end defivar
+
+@defivar DisassembleInfo can_emit_style_escape
+This is @code{True} if the output stream that the disassembler is
+currently printing to can support escape sequences used for colors,
+otherwise this attribute is @code{False}.
+@end defivar
+
+@defmethod DisassembleInfo read_memory (length, offset)
+This method allows the disassembler to read the bytes of the
+instruction to be disassembled.  The method reads @var{length} bytes,
+starting at @var{offset} from @code{DisassembleInfo.address}.
+
+It is important that the disassembler read the instruction bytes using
+this method, rather than reading inferior memory directly, as in some
+cases @value{GDBN} disassembles from an internal buffer rather than
+directly from inferior memory.
+
+Returns a buffer object, which behaves much like an array or a string,
+just as @code{Inferior.read_memory} does
+(@pxref{gdbpy_inferior_read_memory,,Inferior.read_memory}).
+@end defmethod
+
+@defmethod DisassembleInfo set_result (length, string)
+This method is used to set the result after an instruction has
+successfully been disassembled.  The @var{length} is the length in
+bytes of the instruction, and @var{string} is the text that should be
+displayed for the disassembled output.
+
+The @var{length} must be greater than zero, and @var{string} must be a
+non-empty string.
+
+It is valid to call this method multiple times during the disassembly
+of a single instruction; each call replaces the previous result.  In
+this way it is possible to extend the output of a previous
+disassembler.
+
+If @code{DisassembleInfo.memory_error} has previously been called,
+then calling @code{DisassembleInfo.set_result} clears the memory error
+from this @code{DisassembleInfo}.
+@end defmethod
+
+@defmethod DisassembleInfo memory_error (offset)
+This method marks the @code{DisassembleInfo} as having experienced a
+@code{gdb.MemoryError} when trying to access memory at @var{offset}
+bytes from @code{DisassembleInfo.address}.
+
+It is valid to call @code{DisassembleInfo.memory_error} multiple times
+for a single instruction disassembly, but only the first memory error
+is recorded.
+
+If @code{DisassembleInfo.set_result} has already been called, then any
+result is discarded when @code{DisassembleInfo.memory_error} is
+called.
+@end defmethod
+@end deftp
+
+@deftp {class} Disassembler
+This is a base class from which all user implemented disassemblers
+must inherit.
+
+@defmethod Disassembler __init__ (name)
+The constructor takes @var{name}, a string, which should be a short
+name for this disassembler.  Currently, this name is only used in some
+debug output.
+@end defmethod
+
+@defmethod Disassembler __call__ (info)
+The @code{__call__} method must be overridden by sub-classes to
+perform disassembly.  Calling @code{__call__} on this base class will
+raise a @code{NotImplementedError} exception.
+
+The @var{info} argument is an instance of @code{DisassembleInfo}, and
+describes the instruction that @value{GDBN} wants disassembling.
+
+This function must return @code{None}.  If this function raises a
+@code{gdb.MemoryError} exception then @value{GDBN} will ignore the
+exception and fall back to using its builtin disassembler.  Raising any
+other exception is an error.
+@end defmethod
+@end deftp
+
+@defun register_disassembler (disassembler, architecture)
+The @var{disassembler} must be a @code{Disassembler} instance or @code{None}.
+
+The optional @var{architecture} is either a string, or the value
+@code{None}.  If it is a string, then it should be the name of an
+architecture known to @value{GDBN}, as returned either from
+@code{gdb.Architecture.name()}
+(@pxref{gdbpy_architecture_name,,gdb.Architecture.name}), or from
+@code{gdb.architecture_names()}
+(@pxref{gdb_architecture_names,,gdb.architecture_names}).
+
+The @var{disassembler} will be installed for the architecture named by
+@var{architecture}, or if @var{architecture} is @code{None}, then
+@var{disassembler} will be installed as a global disassembler for use
+by all architectures.
+
+@value{GDBN} only records a single disassembler for each architecture,
+and a single global disassembler.  Calling
+@code{register_disassembler} for an architecture, or for the global
+disassembler, will replace any existing disassembler registered for
+that @var{architecture} value.  The previous disassembler is returned.
+
+When @value{GDBN} is looking for a disassembler to use, @value{GDBN}
+first looks for an architecture specific disassembler.  If none has
+been registered then @value{GDBN} looks for a global disassembler (one
+registered with @var{architecture} set to @code{None}).  Only one
+disassembler is called to perform disassembly, so, if there is both an
+architecture specific disassembler, and a global disassembler
+registered, it is the architecture specific disassembler that will be
+used.
+
+@value{GDBN} tracks the architecture specific, and global
+disassemblers separately, so it doesn't matter in which order
+disassemblers are created or registered; an architecture specific
+disassembler, if present, will always be used before a global
+disassembler.
+@end defun
+
+@defun format_address (architecture, address)
+Returns @var{address} formatted as a string, in a style suitable for
+including in the disassembly output of an instruction, for example a
+formatted address might look like:
+
+@smallexample
+0x00001042 <symbol+16>
+@end smallexample
+
+@var{architecture} is a @code{gdb.Architecture} (@pxref{Architectures
+In Python}), which is required to format the addresses correctly.
+This can be obtained from @code{DisassembleInfo.architecture}.
+@end defun
+
+@defun syntax_highlight (info)
+This function can be used to apply syntax highlighting to the result
+already held within @var{info}, a @code{DisassembleInfo}.
+
+After calling this function the result in @var{info} @emph{might} have
+been updated to include syntax highlighting escape sequences.  If
+syntax highlighting is disabled in @value{GDBN}, or the output stream
+doesn't support syntax highlighting, then this function will leave
+@var{info} unchanged.
+
+If @var{info} doesn't have a result set when this function is called
+then @var{info} will not be modified.
+
+This function returns @code{None}.
+@end defun
+
+@defun builtin_disassemble (info, memory_source)
+This function calls back into @value{GDBN}'s builtin disassembler to
+disassemble the instruction identified by @var{info}, an instance of
+@code{DisassembleInfo}.
+
+After calling this function, if the instruction disassembled
+successfully, then @var{info} will have been updated as though
+@code{DisassembleInfo.set_result} had been called.  The results of the
+builtin disassembler can be examined by reading
+@code{DisassembleInfo.length} and @code{DisassembleInfo.string}.
+
+If the builtin disassembler fails then this function will raise a
+@code{gdb.MemoryError} exception.
+
+The optional @var{memory_source} argument has the default value of
+@code{None}, in which case, the builtin disassembler will read the
+instruction from memory in the normal way.
+
+If @var{memory_source} is not @code{None}, then it should be an
+instance of a class that implements the following method:
+
+@defmethod memory_source read_memory (length, offset)
+This method will be called by the builtin disassembler to fetch bytes
+of the instruction being disassembled.  @var{length} is the number of
+bytes to fetch, and @var{offset} is the offset from the address of the
+instruction being disassembled; this address is obtained from
+@code{DisassembleInfo.address}.
+
+This function should return a Python object that supports the buffer
+protocol, i.e. a string, an array, or the object returned from
+@code{DisassembleInfo.read_memory}.
+
+The length of the returned buffer @emph{must} be @var{length}
+otherwise a @code{ValueError} exception will be raised.
+
+Alternatively, this function can raise a @code{gdb.MemoryError}
+exception to indicate that the read failed; raising any other
+exception type is an error.
+@end defmethod
+@end defun
+
+Here is an example that registers a global disassembler.  The new
+disassembler invokes the builtin disassembler, and then adds a
+comment, @code{## Comment}, to each line of disassembly output, before
+finally applying syntax highlighting to the result:
+
+@smallexample
+class ExampleDisassembler(gdb.disassembler.Disassembler):
+    def __init__(self):
+        super(ExampleDisassembler, self).__init__("ExampleDisassembler")
+
+    def __call__(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        if info.string is not None:
+            tmp = info.string + "\t## Comment"
+            info.set_result(info.length, tmp)
+            gdb.disassembler.syntax_highlight(info)
+
+gdb.disassembler.register_disassembler(ExampleDisassembler())
+@end smallexample
+
 @node Python Auto-loading
 @subsection Python Auto-loading
 @cindex Python auto-loading
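The memory_source protocol documented for builtin_disassemble can be sketched outside GDB. In the block below, PatchedMemory, FakeMemoryError, _insn and _fake_read are all hypothetical names introduced for illustration; FakeMemoryError stands in for gdb.MemoryError so that the code runs without GDB. The sketch assumes only what the documentation above states: read_memory receives a length and an offset, must return a buffer of exactly that length, and signals failure by raising a memory error.

```python
class FakeMemoryError(Exception):
    """Stand-in for gdb.MemoryError so the sketch runs outside GDB."""


class PatchedMemory:
    """Hypothetical memory_source: serves bytes from a base reader, with
    the bytes at selected offsets replaced."""

    def __init__(self, base_read, overrides):
        # base_read(length, offset) returns a bytes-like object; inside
        # GDB this could be info.read_memory.
        self._base_read = base_read
        # Maps an offset from the instruction address to a replacement
        # byte value in the range 0..255.
        self._overrides = dict(overrides)

    def read_memory(self, length, offset=0):
        data = bytearray(self._base_read(length, offset))
        # The returned buffer must be exactly LENGTH bytes long.
        if len(data) != length:
            raise FakeMemoryError("short read at offset %d" % offset)
        for i in range(length):
            if offset + i in self._overrides:
                data[i] = self._overrides[offset + i]
        return bytes(data)


# Usage outside GDB, with a fake four-byte instruction stream.
_insn = bytes([0x90, 0x90, 0xC3, 0x00])


def _fake_read(length, offset=0):
    if offset < 0 or offset + length > len(_insn):
        raise FakeMemoryError("out of range")
    return _insn[offset:offset + length]


source = PatchedMemory(_fake_read, {1: 0xCC})
patched = source.read_memory(3, 0)
```

Inside GDB, going by the API described in this patch, such an object would presumably be passed as gdb.disassembler.builtin_disassemble(info, PatchedMemory(info.read_memory, ...)), raising gdb.MemoryError rather than the stand-in.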
diff --git a/gdb/python/lib/gdb/disassembler.py b/gdb/python/lib/gdb/disassembler.py
new file mode 100644
index 00000000000..9cf247a89e7
--- /dev/null
+++ b/gdb/python/lib/gdb/disassembler.py
@@ -0,0 +1,194 @@ 
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+"""Disassembler related module."""
+
+import gdb
+import _gdb.disassembler
+
+from _gdb.disassembler import *
+
+# Module global dictionary of gdb.disassembler.Disassembler objects.
+# The keys of this dictionary are bfd architecture names, or the
+# special value None.
+#
+# When a request to disassemble comes in we first lookup the bfd
+# architecture name from the gdbarch, if that name exists in this
+# dictionary then we use that Disassembler object.
+#
+# If there's no architecture specific disassembler then we look for
+# the key None in this dictionary, and if that key exists, we use that
+# disassembler.
+_disassembly_registry = {}
+
+# Module global callback.  This is the entry point that GDB calls, but
+# only if this is a callable thing.
+#
+# Initially we set this to None, so GDB will not try to call into any
+# Python code.
+#
+# When Python disassemblers are registered into _disassembly_registry
+# then this will be set to something callable.
+_print_insn = None
+
+
+class Disassembler(object):
+    """A base class from which all user implemented disassemblers must
+    inherit."""
+
+    def __init__(self, name):
+        """Constructor.  Takes a name, which should be a string, which can be
+        used to identify this disassembler in diagnostic messages."""
+        self.name = name
+
+    def __call__(self, info):
+        """A default implementation of __call__.  All sub-classes must
+        override this method.  Calling this default implementation will throw
+        a NotImplementedError exception."""
+        raise NotImplementedError("Disassembler.__call__")
+
+
+def register_disassembler(disassembler, architecture=None):
+    """Register a disassembler.  DISASSEMBLER is an instance of a
+    gdb.disassembler.Disassembler sub-class, or None.  ARCHITECTURE is
+    either None or a string, the name of an architecture known to GDB.
+
+    DISASSEMBLER is registered as a disassembler for ARCHITECTURE, or
+    all architectures when ARCHITECTURE is None.
+
+    Returns the previous disassembler registered with this
+    ARCHITECTURE value.
+    """
+
+    if not isinstance(disassembler, Disassembler) and disassembler is not None:
+        raise TypeError("disassembler should sub-class gdb.disassembler.Disassembler")
+
+    old = None
+    if architecture in _disassembly_registry:
+        old = _disassembly_registry[architecture]
+        del _disassembly_registry[architecture]
+    if disassembler is not None:
+        _disassembly_registry[architecture] = disassembler
+
+    global _print_insn
+    if len(_disassembly_registry) > 0:
+        _print_insn = _perform_disassembly
+    else:
+        _print_insn = None
+
+    return old
+
+
+def _lookup_disassembler(arch):
+    try:
+        name = arch.name()
+        if name is None:
+            return None
+        if name in _disassembly_registry:
+            return _disassembly_registry[name]
+        if None in _disassembly_registry:
+            return _disassembly_registry[None]
+        return None
+    except Exception:
+        return None
+
+
+def _perform_disassembly(info):
+    disassembler = _lookup_disassembler(info.architecture)
+    if disassembler is None:
+        return None
+    return disassembler(info)
+
+
+class StyleDisassembly(gdb.Parameter):
+    def __init__(self):
+        super(StyleDisassembly, self).__init__(
+            "style disassembly", gdb.COMMAND_NONE, gdb.PARAM_BOOLEAN
+        )
+        self.value = True
+        self._pygments_module_available = True
+
+    def get_show_string(self, sval):
+        return 'Disassembly styling is "%s".' % sval
+
+    def get_set_string(self):
+        if not self._pygments_module_available and self.value:
+            self.value = False
+            return "Python pygments module is not available"
+        return ""
+
+    def failed_to_load_pygments(self):
+        self.value = False
+        self._pygments_module_available = False
+
+    def __bool__(self):
+        return self.value
+
+    def __nonzero__(self):
+        if self.value:
+            return 1
+        else:
+            return 0
+
+
+style_disassembly_param = StyleDisassembly()
+
+try:
+    from pygments import formatters, lexers, highlight
+
+    _lexer = lexers.get_lexer_by_name("asm")
+    _formatter = formatters.TerminalFormatter()
+
+    def syntax_highlight(info):
+        # If we should not be performing syntax highlighting, or if
+        # INFO does not hold a result, then there's nothing to do.
+        if (
+            not gdb.parameter("style enabled")
+            or not style_disassembly_param
+            or not info.can_emit_style_escape
+            or info.string is None
+        ):
+            return
+        # Now apply the highlighting, and update the result.
+        highlighted = highlight(info.string, _lexer, _formatter)
+        info.set_result(info.length, highlighted.strip())
+
+    class _SyntaxHighlightingDisassembler(Disassembler):
+        """A syntax highlighting disassembler."""
+
+        def __init__(self, name):
+            """Constructor."""
+            super(_SyntaxHighlightingDisassembler, self).__init__(name)
+
+        def __call__(self, info):
+            """Invoke the builtin disassembler, and syntax highlight the result."""
+            gdb.disassembler.builtin_disassemble(info)
+            gdb.disassembler.syntax_highlight(info)
+
+    register_disassembler(
+        _SyntaxHighlightingDisassembler("syntax_highlighting_disassembler")
+    )
+
+except:
+
+    # Update the 'set/show style disassembly' parameter now we know
+    # that the pygments module can't be loaded.
+    style_disassembly_param.failed_to_load_pygments()
+
+    def syntax_highlight(info):
+        # An implementation of syntax_highlight that can safely be
+        # called even when syntax highlighting is not available.
+        # This just returns, leaving INFO unmodified.
+        return
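The registration and lookup rules implemented in disassembler.py above (one slot per architecture name, one global slot under the key None, architecture specific entries taking priority) can be modelled in a few lines. This is a simplified sketch rather than the patch's code: register and lookup are hypothetical names, plain strings stand in for Disassembler instances, and the type check and the _print_insn hook update are left out.

```python
# Maps an architecture name, or None for the global slot, to whatever
# was registered for it.
_registry = {}


def register(disassembler, architecture=None):
    """Install DISASSEMBLER for ARCHITECTURE and return the previous
    entry for that slot.  Passing None as DISASSEMBLER empties the slot."""
    old = _registry.pop(architecture, None)
    if disassembler is not None:
        _registry[architecture] = disassembler
    return old


def lookup(arch_name):
    """Prefer an architecture specific entry, then the global one."""
    if arch_name in _registry:
        return _registry[arch_name]
    return _registry.get(None)


# Registration order does not matter; the specific entry still wins.
register("global-dis")
register("riscv-dis", "riscv:rv64")
chosen_riscv = lookup("riscv:rv64")
chosen_other = lookup("i386")
```

Here chosen_riscv is the architecture specific entry and chosen_other is the global one. Calling register(None, "riscv:rv64") afterwards returns the old entry and leaves only the global disassembler in effect for that architecture.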
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 3e7970ab764..1855f3daab3 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -72,6 +72,15 @@  arch_object_to_gdbarch (PyObject *obj)
   return py_arch->gdbarch;
 }
 
+/* See python-internal.h.  */
+
+bool
+gdbpy_is_arch_object (PyObject *obj)
+{
+  gdb_assert (obj != nullptr);
+  return PyObject_TypeCheck (obj, &arch_object_type);
+}
+
 /* Returns the Python architecture object corresponding to GDBARCH.
    Returns a new reference to the arch_object associated as data with
    GDBARCH.  */
diff --git a/gdb/python/py-disasm.c b/gdb/python/py-disasm.c
new file mode 100644
index 00000000000..3327e532270
--- /dev/null
+++ b/gdb/python/py-disasm.c
@@ -0,0 +1,905 @@ 
+/* Python interface to instruction disassembly.
+
+   Copyright (C) 2008-2021 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "dis-asm.h"
+#include "arch-utils.h"
+#include "charset.h"
+#include "disasm.h"
+
+/* Implement gdb.disassembler.DisassembleInfo type.  An object of this type
+   represents a single disassembler request from GDB.  */
+
+struct disasm_info_object {
+  PyObject_HEAD
+
+  /* The architecture in which we are disassembling.  */
+  struct gdbarch *gdbarch;
+
+  /* Address of the instruction to disassemble.  */
+  bfd_vma address;
+
+  disassemble_info *gdb_info;
+  disassemble_info *py_info;
+
+  /* The length of the disassembled instruction, a value of -1 indicates
+     that there is no disassembly result set, otherwise, this should be a
+     value greater than zero.  */
+  int length;
+
+  /* A string buffer containing the disassembled instruction.  This is
+     initially nullptr, and is allocated when needed.  It is possible that
+     the length field (above) can be -1, but this buffer is still
+     allocated, this happens if the user first sets a result, and then
+     marks a memory error.  In this case any value in CONTENT should be
+     ignored.  */
+  string_file *content;
+
+  /* When the user indicates that a memory error has occurred then this
+     field is set to true, it is false by default.  */
+  bool memory_error_address_p;
+
+  /* When the user indicates that a memory error has occurred then the
+     address of the memory error is stored in here.  This field is only
+     valid when MEMORY_ERROR_ADDRESS_P is true, otherwise this field is
+     undefined.  */
+  CORE_ADDR memory_error_address;
+
+  /* When the user calls the builtin_disassemble function, if they pass a
+     memory source object then a pointer to the object is placed in here,
+     otherwise, this field is nullptr.  */
+  PyObject *memory_source;
+};
+
+extern PyTypeObject disasm_info_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("disasm_info_object");
+
+typedef int (*read_memory_ftype)
+    (bfd_vma memaddr, bfd_byte *myaddr, unsigned int length,
+     struct disassemble_info *dinfo);
+
+/* A sub-class of gdb_disassembler that holds a pointer to a Python
+   DisassembleInfo object.  A pointer to an instance of this class is
+   placed in the application_data field of the disassemble_info that is
+   used when we call gdbarch_print_insn.  */
+
+struct gdbpy_disassembler : public gdb_disassembler
+{
+  /* Constructor.  */
+  gdbpy_disassembler (struct gdbarch *gdbarch, struct ui_file *stream,
+		      disasm_info_object *obj);
+
+  /* Get the DisassembleInfo object pointer.  */
+  disasm_info_object *
+  py_disasm_info () const
+  {
+    return m_disasm_info_object;
+  }
+
+  /* Mark this class as a friend so that it can call the disasm_info
+     method, which is protected in our parent.  */
+  friend class scoped_disasm_info_object;
+
+private:
+  /* The DisassembleInfo object we are disassembling for.  */
+  disasm_info_object *m_disasm_info_object;
+};
+
+/* Return true if OBJ is still valid, otherwise, return false.  A valid OBJ
+   will have a non-nullptr gdb_info field.  */
+
+static bool
+disasmpy_info_is_valid (disasm_info_object *obj)
+{
+  if (obj->gdb_info == nullptr)
+    gdb_assert (obj->py_info == nullptr);
+  else
+    gdb_assert (obj->py_info != nullptr);
+
+  return obj->gdb_info != nullptr;
+}
+
+/* Ensure that a gdb.disassembler.DisassembleInfo is valid.  */
+#define DISASMPY_DISASM_INFO_REQUIRE_VALID(Info)			\
+  do {									\
+    if (!disasmpy_info_is_valid (Info))					\
+      {									\
+	PyErr_SetString (PyExc_RuntimeError,				\
+			 _("DisassembleInfo is no longer valid."));	\
+	return nullptr;							\
+      }									\
+  } while (0)
+
+/* Mark OBJ as having a memory error at ADDR.  Only the first memory error
+   is recorded, so if OBJ has already had a memory error set then this
+   call will have no effect.  */
+
+static void
+disasmpy_set_memory_error (disasm_info_object *obj, CORE_ADDR addr)
+{
+  if (!obj->memory_error_address_p)
+    {
+      obj->memory_error_address = addr;
+      obj->memory_error_address_p = true;
+    }
+}
+
+/* Clear any memory error already set on OBJ.  If there is no memory error
+   set on OBJ then this call has no effect.  */
+
+static void
+disasmpy_clear_memory_error (disasm_info_object *obj)
+{
+  obj->memory_error_address_p = false;
+}
+
+/* Clear any previous disassembler result stored within OBJ.  If there was
+   no previous disassembler result then calling this function has no
+   effect.  */
+
+static void
+disasmpy_clear_disassembler_result (disasm_info_object *obj)
+{
+  obj->length = -1;
+  gdb_assert (obj->content != nullptr);
+  obj->content->clear ();
+}
+
+/* Implement gdb.disassembler.builtin_disassemble().  Calls back into GDB's
+   builtin disassembler.  The first argument is a DisassembleInfo object
+   describing what to disassemble.  The second argument is optional and
+   provides a mechanism to modify the memory contents that the builtin
+   disassembler will actually disassemble.  Returns the Python None value.  */
+
+static PyObject *
+disasmpy_builtin_disassemble (PyObject *self, PyObject *args, PyObject *kw)
+{
+  PyObject *info_obj, *memory_source_obj = nullptr;
+  static const char *keywords[] = { "info", "memory_source", nullptr };
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O!|O", keywords,
+					&disasm_info_object_type, &info_obj,
+					&memory_source_obj))
+    return nullptr;
+
+  disasm_info_object *disasm_info = (disasm_info_object *) info_obj;
+  if (!disasmpy_info_is_valid (disasm_info))
+    {
+      PyErr_SetString (PyExc_RuntimeError,
+		       _("DisassembleInfo is no longer valid."));
+      return nullptr;
+    }
+
+  gdb::optional<scoped_restore_tmpl<PyObject *>> restore_memory_source;
+
+  disassemble_info *info = disasm_info->py_info;
+  if (memory_source_obj != nullptr)
+    {
+      if (!PyObject_HasAttrString (memory_source_obj, "read_memory"))
+	{
+	  PyErr_SetString (PyExc_TypeError,
+			   _("memory_source doesn't have a read_memory method"));
+	  return nullptr;
+	}
+
+      gdb_assert (disasm_info->memory_source == nullptr);
+      restore_memory_source.emplace (&disasm_info->memory_source,
+				     memory_source_obj);
+    }
+
+  /* When the user calls the builtin disassembler any previous result or
+     memory error is discarded, and we start fresh.  */
+  disasmpy_clear_disassembler_result (disasm_info);
+  disasmpy_clear_memory_error (disasm_info);
+
+  /* Now actually perform the disassembly.  */
+  disasm_info->length
+    = gdbarch_print_insn (disasm_info->gdbarch, disasm_info->address, info);
+
+  if (disasm_info->length == -1)
+    {
+      /* In an ideal world, every disassembler should always call the
+	 memory error function before returning a status of -1 as the only
+	 error a disassembler should encounter is a failure to read
+	 memory.  Unfortunately, there are some disassemblers who don't
+	 follow this rule, and will return -1 without calling the memory
+	 error function.
+
+	 To make the Python API simpler, we just classify everything as a
+	 memory error, but the message has to be modified for the case
+	 where the disassembler didn't call the memory error function.  */
+      if (disasm_info->memory_error_address_p)
+	{
+	  CORE_ADDR addr = disasm_info->memory_error_address;
+	  PyErr_Format (gdbpy_gdb_memory_error,
+			"failed to read memory at %s",
+			core_addr_to_string (addr));
+	}
+      else
+	PyErr_Format (gdbpy_gdb_memory_error, "failed to read memory");
+      return nullptr;
+    }
+
+  /* Instructions are either non-zero in length, or we got an error,
+     indicated by a length of -1, which we handled above.  */
+  gdb_assert (disasm_info->length > 0);
+
+  /* We should not have seen a memory error in this case.  */
+  gdb_assert (!disasm_info->memory_error_address_p);
+
+  Py_RETURN_NONE;
+}
+
+/* Implement DisassembleInfo.read_memory(LENGTH, OFFSET).  Read LENGTH
+   bytes at OFFSET from the start of the instruction currently being
+   disassembled, and return a memory buffer containing the bytes.
+
+   OFFSET defaults to zero if it is not provided.  LENGTH is required.  If
+   the read fails then this will raise a gdb.MemoryError exception.  */
+
+static PyObject *
+disasmpy_info_read_memory (PyObject *self, PyObject *args, PyObject *kw)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+  LONGEST length, offset = 0;
+  gdb::unique_xmalloc_ptr<gdb_byte> buffer;
+  static const char *keywords[] = { "length", "offset", nullptr };
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "L|L", keywords,
+					&length, &offset))
+    return nullptr;
+
+  /* The apparent address from which we are reading memory.  Note that in
+     some cases GDB actually disassembles instructions from a buffer, so
+     we might not actually be reading this information directly from the
+     inferior memory.  This is all hidden behind the read_memory_func API
+     within the disassemble_info structure.  */
+  CORE_ADDR address = obj->address + offset;
+
+  /* Setup a buffer to hold the result.  */
+  buffer.reset ((gdb_byte *) xmalloc (length));
+
+  /* Read content into BUFFER.  If the read fails then raise a memory
+     error, otherwise, convert BUFFER to a Python memory buffer, and return
+     it to the user.  */
+  disassemble_info *info = obj->gdb_info;
+  if (info->read_memory_func ((bfd_vma) address, buffer.get (),
+			      (unsigned int) length, info) != 0)
+    {
+      PyErr_Format (gdbpy_gdb_memory_error,
+		    "failed to read %s bytes at %s",
+		    pulongest ((ULONGEST) length),
+		    core_addr_to_string (address));
+      return nullptr;
+    }
+  return gdbpy_buffer_to_membuf (std::move (buffer), address, length);
+}
+
+/* Implement DisassembleInfo.set_result(LENGTH, STRING).  Discard any
+   previous memory error and set the result of this disassembly to be
+   STRING, a LENGTH bytes long instruction.  The LENGTH must be greater
+   than zero otherwise a ValueError exception is raised.  STRING must be a
+   non-empty string, or a ValueError exception is raised.  */
+
+static PyObject *
+disasmpy_info_set_result (PyObject *self, PyObject *args, PyObject *kw)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+  static const char *keywords[] = { "length", "string", nullptr };
+  int length;
+  const char *string;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "is", keywords,
+					&length, &string))
+    return nullptr;
+
+  if (length <= 0)
+    {
+      PyErr_SetString (PyExc_ValueError,
+		       _("Length must be greater than 0."));
+      return nullptr;
+    }
+
+  size_t string_len = strlen (string);
+  if (string_len == 0)
+    {
+      PyErr_SetString (PyExc_ValueError, _("String must not be empty."));
+      return nullptr;
+    }
+
+  /* Discard any previously recorded memory error, and any previous
+     disassembler result.  */
+  disasmpy_clear_memory_error (obj);
+  disasmpy_clear_disassembler_result (obj);
+
+  /* And set the result.  */
+  obj->length = length;
+  gdb_assert (obj->content != nullptr);
+  obj->content->write (string, string_len);
+
+  Py_RETURN_NONE;
+}
+
+/* Implement DisassembleInfo.memory_error().  Mark SELF (a DisassembleInfo
+   object) as having a memory error.  Any previous result is discarded.  */
+
+static PyObject *
+disasmpy_info_memory_error (PyObject *self, PyObject *args, PyObject *kw)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+  static const char *keywords[] = { "offset", nullptr };
+  LONGEST offset;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "L", keywords,
+					&offset))
+    return nullptr;
+
+  /* Discard any previous disassembler result, and mark OBJ as having a
+     memory error.  */
+  disasmpy_clear_disassembler_result (obj);
+  disasmpy_set_memory_error (obj, obj->address + offset);
+
+  Py_RETURN_NONE;
+}
+
+/* Implement gdb.disassembler.format_address(ARCH, ADDR).  Formats ADDR, an
+   address and returns a string.  ADDR will be formatted in the style that
+   the disassembler uses: '0x.... <symbol + offset>'.  ARCH is a
+   gdb.Architecture used to perform the formatting.  */
+
+static PyObject *
+disasmpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "architecture", "address", nullptr };
+  PyObject *addr_obj, *arch_obj;
+  CORE_ADDR addr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OO", keywords,
+					&arch_obj, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  if (!gdbpy_is_arch_object (arch_obj))
+    {
+      PyErr_SetString (PyExc_TypeError,
+		       _("architecture argument is not a gdb.Architecture"));
+      return nullptr;
+    }
+
+  gdbarch *gdbarch = arch_object_to_gdbarch (arch_obj);
+  if (gdbarch == nullptr)
+    {
+      PyErr_SetString (PyExc_RuntimeError,
+		       _("architecture argument is invalid."));
+      return nullptr;
+    }
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
+/* Implement DisassembleInfo.address attribute, return the address at which
+   GDB would like an instruction disassembled.  */
+
+static PyObject *
+disasmpy_info_address (PyObject *self, void *closure)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+  return gdb_py_object_from_longest (obj->address).release ();
+}
+
+/* Implement DisassembleInfo.string attribute.  Return a string containing
+   the current disassembly result, or None if there is no current
+   disassembly result.  */
+
+static PyObject *
+disasmpy_info_string (PyObject *self, void *closure)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+
+  gdb_assert (obj->content != nullptr);
+  if (strlen (obj->content->c_str ()) == 0)
+    Py_RETURN_NONE;
+  gdb_assert (obj->length > 0);
+  return PyUnicode_Decode (obj->content->c_str (),
+			   obj->content->size (),
+			   host_charset (), nullptr);
+}
+
+/* Implement DisassembleInfo.length attribute.  Return the length of the
+   current disassembled instruction, as set by a call to
+   DisassembleInfo.set_result.  If no result has been set yet, or if a call
+   to DisassembleInfo.memory_error has invalidated the result, then None is
+   returned.  */
+
+static PyObject *
+disasmpy_info_length (PyObject *self, void *closure)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+  if (obj->length == -1)
+    Py_RETURN_NONE;
+  gdb_assert (obj->length > 0);
+  gdb_assert (obj->content != nullptr);
+  gdb_assert (strlen (obj->content->c_str ()) > 0);
+  return gdb_py_object_from_longest (obj->length).release ();
+}
+
+/* Implement DisassembleInfo.architecture attribute.  Return the
+   gdb.Architecture in which we are disassembling.  */
+
+static PyObject *
+disasmpy_info_architecture (PyObject *self, void *closure)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+  return gdbarch_to_arch_object (obj->gdbarch);
+}
+
+/* Implement DisassembleInfo.can_emit_style_escape attribute.  Returns True
+   if the output stream that the disassembly result will be written to
+   supports style escapes, otherwise, returns False.  */
+
+static PyObject *
+disasmpy_info_can_emit_style_escape (PyObject *self, void *closure)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+  DISASMPY_DISASM_INFO_REQUIRE_VALID (obj);
+  bool can_emit_style_escape = current_uiout->can_emit_style_escape ();
+  return PyBool_FromLong (can_emit_style_escape ? 1 : 0);
+}
+
+/* This implements the disassemble_info read_memory_func callback.  This
+   will either call the standard read memory function, or, if the user has
+   supplied a memory source (see disasmpy_builtin_disassemble) then this
+   will call back into Python to obtain the memory contents.
+
+   Read LEN bytes from MEMADDR and place them into BUFF.  Return 0 on
+   success (in which case BUFF has been filled), or -1 on error, in which
+   case the contents of BUFF are undefined.  */
+
+static int
+disasmpy_read_memory_func (bfd_vma memaddr, gdb_byte *buff,
+			  unsigned int len, struct disassemble_info *info)
+{
+  gdbpy_disassembler *dis
+    = static_cast<gdbpy_disassembler *> (info->application_data);
+  disasm_info_object *obj = dis->py_disasm_info ();
+
+  /* The simple case, the user didn't pass a separate memory source, so we
+     just delegate to the standard disassemble_info read_memory_func.  */
+  if (obj->memory_source == nullptr)
+    return obj->gdb_info->read_memory_func (memaddr, buff, len, obj->gdb_info);
+
+  /* The user provided a separate memory source, we need to call the
+     read_memory method on the memory source and use the buffer it returns
+     as the bytes of memory.  */
+  PyObject *memory_source = obj->memory_source;
+  LONGEST offset = (LONGEST) memaddr - (LONGEST) obj->address;
+  gdbpy_ref<> result_obj (PyObject_CallMethod (memory_source, "read_memory",
+					       "IL", len, offset));
+  if (result_obj == nullptr)
+    {
+      /* If we got a gdb.MemoryError then we ignore this and just report
+	 that the read failed to the caller.  For any other exception type
+	 we assume this is a bug in the user's code, print the stack, and
+	 then report the read failed.  */
+      if (PyErr_ExceptionMatches (gdbpy_gdb_memory_error))
+	PyErr_Clear ();
+      else
+	gdbpy_print_stack ();
+      return -1;
+    }
+
+  /* Convert the result to a buffer.  */
+  Py_buffer py_buff;
+  if (!PyObject_CheckBuffer (result_obj.get ())
+      || PyObject_GetBuffer (result_obj.get (), &py_buff, PyBUF_CONTIG_RO) < 0)
+    {
+      PyErr_Format (PyExc_TypeError,
+		    _("Result from read_memory is not a buffer"));
+      gdbpy_print_stack ();
+      return -1;
+    }
+
+  /* Wrap PY_BUFF so that it is cleaned up correctly at the end of this
+     scope.  */
+  Py_buffer_up buffer_up (&py_buff);
+
+  /* Validate that the buffer is the correct length.  */
+  if (py_buff.len != len)
+    {
+      PyErr_Format (PyExc_ValueError,
+		    _("Result from read_memory is incorrectly sized buffer"));
+      gdbpy_print_stack ();
+      return -1;
+    }
+
+  /* Copy the data out of the Python buffer and return success.  */
+  const gdb_byte *buffer = (const gdb_byte *) py_buff.buf;
+  memcpy (buff, buffer, len);
+  return 0;
+}
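A memory source, as consumed by disasmpy_read_memory_func above, needs only a read_memory method that takes the length first and the offset second, and returns a buffer of exactly the requested size. A minimal sketch follows. `PatchedMemory` is a hypothetical name, and `LookupError` stands in for gdb.MemoryError so that the example runs outside GDB.

```python
class PatchedMemory:
    """Hypothetical memory source that serves bytes from a fixed buffer."""

    def __init__(self, data):
        self._data = bytes(data)

    def read_memory(self, length, offset=0):
        # OFFSET is relative to the address of the instruction being
        # disassembled; LENGTH is the number of bytes wanted.
        end = offset + length
        if offset < 0 or end > len(self._data):
            # Inside GDB this would be gdb.MemoryError, which the C++
            # caller treats as a failed read without printing a stack.
            raise LookupError("read outside of the buffer")
        # A bytes object supports the buffer protocol and has exactly
        # LENGTH octets, which is what the C++ caller validates.
        return self._data[offset:end]


source = PatchedMemory(b"\x90\x90\xc3")
print(source.read_memory(2, 1))
```

Returning a buffer of any other size is reported as a ValueError by the C++ code and the read fails.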
+
+/* Implement memory_error_func callback for disassemble_info.  Extract the
+   underlying DisassembleInfo Python object, and set a memory error on
+   it.  */
+
+static void
+disasmpy_memory_error_func (int status, bfd_vma memaddr,
+			   struct disassemble_info *info)
+{
+  gdbpy_disassembler *dis
+    = static_cast<gdbpy_disassembler *> (info->application_data);
+  disasm_info_object *obj = dis->py_disasm_info ();
+  disasmpy_set_memory_error (obj, memaddr);
+}
+
+/* Constructor.  */
+
+gdbpy_disassembler::gdbpy_disassembler (struct gdbarch *gdbarch,
+					struct ui_file *stream,
+					disasm_info_object *obj)
+  : gdb_disassembler (gdbarch, stream, disasmpy_read_memory_func,
+		      disasmpy_memory_error_func),
+    m_disasm_info_object (obj)
+{ /* Nothing.  */ }
+
+/* A wrapper around a reference to a Python DisassembleInfo object, along
+   with some supporting information that the DisassembleInfo object needs
+   to reference.
+
+   Each DisassembleInfo is created in gdbpy_print_insn, and is done with by
+   the time that function returns.  However, there's nothing to stop a user
+   caching a reference to the DisassembleInfo, and thus keeping the object
+   around.
+
+   We therefore have the notion of a DisassembleInfo becoming invalid, this
+   happens when gdbpy_print_insn returns.  This class is responsible for
+   marking the DisassembleInfo as invalid in its destructor.  */
+
+struct scoped_disasm_info_object
+{
+  /* Constructor.  */
+  scoped_disasm_info_object (struct gdbarch *gdbarch, CORE_ADDR memaddr,
+			 disassemble_info *info)
+    : m_disasm_info (allocate_disasm_info_object ()),
+      m_py_disassembler (gdbarch, &m_string_file, m_disasm_info.get ())
+  {
+    m_disasm_info->address = memaddr;
+    m_disasm_info->gdb_info = info;
+    m_disasm_info->py_info = m_py_disassembler.disasm_info ();
+    m_disasm_info->length = -1;
+    m_disasm_info->content = &m_string_file;
+    m_disasm_info->gdbarch = gdbarch;
+    m_disasm_info->memory_error_address_p = false;
+    m_disasm_info->memory_error_address = 0;
+    m_disasm_info->memory_source = nullptr;
+  }
+
+  /* Upon destruction clear pointers to state that will no longer be
+     valid.  These fields are checked in disasmpy_info_is_valid to see if
+     the disasm_info_object is still valid or not.  */
+  ~scoped_disasm_info_object ()
+  {
+    m_disasm_info->gdb_info = nullptr;
+    m_disasm_info->py_info = nullptr;
+    m_disasm_info->content = nullptr;
+  }
+
+  /* Return a pointer to the underlying disasm_info_object instance.  */
+  disasm_info_object *
+  get () const
+  {
+    return m_disasm_info.get ();
+  }
+
+private:
+
+  /* Wrapper around the call to PyObject_New, this wrapper function can be
+     called from the constructor initialization list, while PyObject_New, a
+     macro, can't.  */
+  static disasm_info_object *
+  allocate_disasm_info_object ()
+  {
+    return (disasm_info_object *) PyObject_New (disasm_info_object,
+						&disasm_info_object_type);
+  }
+
+  /* A reference to a gdb.disassembler.DisassembleInfo object.  When this
+     containing instance goes out of scope this reference is released,
+     however, the user might be holding other references to the
+     DisassembleInfo object in Python code, so the underlying object might
+     not be deleted.  */
+  gdbpy_ref<disasm_info_object> m_disasm_info;
+
+  /* A location into which the output of the Python disassembler is
+     collected.  We only send this back to GDB once the Python disassembler
+     has completed successfully.  */
+  string_file m_string_file;
+
+  /* Core GDB requires that the disassemble_info application_data field be
+     an instance of, or a sub-class of, gdb_disassembler.  We use a
+     sub-class so that functions within the file can obtain a pointer to
+     the disasm_info_object from the application_data.  */
+  gdbpy_disassembler m_py_disassembler;
+};
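The lifetime rule enforced by scoped_disasm_info_object can be illustrated with a Python context manager. This is a sketch under assumptions: `InfoModel`, `scoped_info`, and the error message are hypothetical, and the real validity check is done on the C++ side by DISASMPY_DISASM_INFO_REQUIRE_VALID.

```python
import contextlib


class InfoModel:
    """Hypothetical stand-in for a DisassembleInfo with a validity flag."""

    def __init__(self, address):
        self._address = address
        self._valid = True

    @property
    def address(self):
        # Every attribute and method checks validity before doing work.
        if not self._valid:
            raise RuntimeError("DisassembleInfo is no longer valid.")
        return self._address


@contextlib.contextmanager
def scoped_info(address):
    info = InfoModel(address)
    try:
        yield info
    finally:
        # Mirrors the destructor: the object may outlive this scope if
        # the user cached a reference, so mark it invalid on the way out.
        info._valid = False


with scoped_info(0x1000) as info:
    cached = info
    print(hex(info.address))

try:
    cached.address
except RuntimeError as exc:
    print(exc)
```

The cached reference stays alive after the scope ends, but every access to it now fails.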
+
+/* See python-internal.h.  */
+
+gdb::optional<int>
+gdbpy_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
+		  disassemble_info *info)
+{
+  if (!gdb_python_initialized)
+    return {};
+
+  gdbpy_enter enter_py (get_current_arch (), current_language);
+
+  /* The attribute we are going to lookup that provides the print_insn
+     functionality.  */
+  static const char *callback_name = "_print_insn";
+
+  /* Grab a reference to the gdb.disassembler module, and check it has the
+     attribute that we need.  */
+  static gdbpy_ref<> gdb_python_disassembler_module
+    (PyImport_ImportModule ("gdb.disassembler"));
+  if (gdb_python_disassembler_module == nullptr
+      || !PyObject_HasAttrString (gdb_python_disassembler_module.get (),
+				  callback_name))
+    return {};
+
+  /* Now grab the callback attribute from the module, and check that it is
+     callable.  */
+  gdbpy_ref<> hook
+    (PyObject_GetAttrString (gdb_python_disassembler_module.get (),
+			     callback_name));
+  if (hook == nullptr)
+    {
+      gdbpy_print_stack ();
+      return {};
+    }
+  if (!PyCallable_Check (hook.get ()))
+    return {};
+
+  scoped_disasm_info_object scoped_disasm_info (gdbarch, memaddr, info);
+  disasm_info_object *disasm_info = scoped_disasm_info.get ();
+
+  /* Call into the registered disassembler to (possibly) perform the
+     disassembly.  */
+  PyObject *insn_disas_obj = (PyObject *) disasm_info;
+  gdbpy_ref<> result (PyObject_CallFunctionObjArgs (hook.get (),
+						    insn_disas_obj,
+						    nullptr));
+
+  if (result == nullptr)
+    {
+      if (PyErr_ExceptionMatches (gdbpy_gdb_memory_error))
+	{
+	  /* Uncaught memory errors are not printed, we assume that the
+	     user tried to read some bytes for their custom disassembler,
+	     but the bytes were not available, as such, we should silently
+	     fall back to using the builtin disassembler, which is what
+	     happens when we return no value here.  */
+	  PyErr_Clear ();
+	}
+      else
+	{
+	  /* Any other error while executing the _print_insn callback
+	     should result in a debug stack being printed, then we return
+	     no value to indicate that the builtin disassembler should be
+	     used.  */
+	  gdbpy_print_stack ();
+	}
+      return {};
+    }
+  else if (result != Py_None)
+    error (_("invalid return value from gdb.disassembler._print_insn"));
+
+  if (disasm_info->memory_error_address_p)
+    {
+      /* We pass -1 for the status here.  GDB doesn't make use of this
+	 field, but disassemblers usually pass the result of
+	 read_memory_func as the status, in which case -1 indicates an
+	 error.  */
+      bfd_vma addr = disasm_info->memory_error_address;
+      info->memory_error_func (-1, addr, info);
+      return gdb::optional<int> (-1);
+    }
+
+  /* If the gdb.disassembler.DisassembleInfo object doesn't have a result
+     then return an empty value.  */
+  if (disasm_info->length == -1)
+    return {};
+
+  /* Print the content from the DisassembleInfo back through to GDB's
+     standard fprintf_func handler.  */
+  info->fprintf_func (info->stream, "%s", disasm_info->content->c_str ());
+
+  /* Return the length of this instruction.  */
+  return gdb::optional<int> (disasm_info->length);
+}
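The outcomes of gdbpy_print_insn can be summarised with a small Python model. This is a sketch, not GDB code: `print_insn_outcome` is a hypothetical name, `None` stands for the empty gdb::optional (fall back to the builtin disassembler), and a plain `RuntimeError` stands in for the GDB error raised on a bad return value.

```python
from types import SimpleNamespace


def print_insn_outcome(hook, info):
    """Return None, -1, or the instruction length, as the C++ code does."""
    try:
        result = hook(info)
    except Exception:
        # A gdb.MemoryError is swallowed silently and any other exception
        # has its stack printed; either way the builtin disassembler runs.
        return None
    if result is not None:
        raise RuntimeError("invalid return value from _print_insn")
    if info.memory_error_address is not None:
        return -1                   # memory error is reported to GDB
    if info.length is None:
        return None                 # no result was set
    return info.length


def good_hook(info):
    info.length, info.string = 1, "nop"


info = SimpleNamespace(length=None, string=None, memory_error_address=None)
print(print_insn_outcome(good_hook, info))
```

A recorded memory error takes priority over the result check, matching the order of the tests in the function above.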
+
+/* The tp_dealloc callback for the DisassembleInfo type.  Checks that the
+   pointers to short-lived state have already been cleared, then frees
+   the object.  */
+
+static void
+disasmpy_dealloc (PyObject *self)
+{
+  disasm_info_object *obj = (disasm_info_object *) self;
+
+  /* The memory_source field is only ever temporarily set to non-nullptr
+     during the disasmpy_builtin_disassemble function.  By the end of that
+     function the memory_source field should be back to nullptr.  */
+  gdb_assert (obj->memory_source == nullptr);
+
+  /* The content field will also be reset to nullptr by the end of
+     gdbpy_print_insn, so the following assert should hold.  */
+  gdb_assert (obj->content == nullptr);
+  Py_TYPE (self)->tp_free (self);
+}
+
+/* The get/set attributes of the gdb.disassembler.DisassembleInfo type.  */
+
+static gdb_PyGetSetDef disasm_info_object_getset[] = {
+  { "address", disasmpy_info_address, nullptr,
+    "Start address of the instruction to disassemble.", nullptr },
+  { "string", disasmpy_info_string, nullptr,
+    "String representing the disassembled instruction.", nullptr },
+  { "length", disasmpy_info_length, nullptr,
+    "Length in octets of the disassembled instruction.", nullptr },
+  { "architecture", disasmpy_info_architecture, nullptr,
+    "Architecture to disassemble in", nullptr },
+  { "can_emit_style_escape", disasmpy_info_can_emit_style_escape, nullptr,
+    "Boolean indicating if style escapes can be emitted", nullptr },
+  { nullptr }   /* Sentinel */
+};
+
+/* The methods of the gdb.disassembler.DisassembleInfo type.  */
+
+static PyMethodDef disasm_info_object_methods[] = {
+  { "read_memory", (PyCFunction) disasmpy_info_read_memory,
+    METH_VARARGS | METH_KEYWORDS,
+    "read_memory (LEN, OFFSET = 0) -> Octets[]\n\
+Read LEN octets for the instruction to disassemble." },
+  { "set_result", (PyCFunction) disasmpy_info_set_result,
+    METH_VARARGS | METH_KEYWORDS,
+    "set_result (LENGTH, STRING) -> None\n\
+Set the disassembly result, LENGTH in octets, and disassembly STRING." },
+  { "memory_error", (PyCFunction) disasmpy_info_memory_error,
+    METH_VARARGS | METH_KEYWORDS,
+    "memory_error (OFFSET) -> None\n\
+A memory error occurred when trying to read bytes at OFFSET." },
+  {nullptr}  /* Sentinel */
+};
+
+/* These are the methods we add into the _gdb.disassembler module, which
+   are then imported into the gdb.disassembler module.  These are global
+   functions that support performing disassembly.  */
+
+PyMethodDef python_disassembler_methods[] =
+{
+  { "format_address", (PyCFunction) disasmpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ARCHITECTURE, ADDRESS) -> String.\n\
+Format ADDRESS as a string suitable for use in disassembler output." },
+  { "builtin_disassemble", (PyCFunction) disasmpy_builtin_disassemble,
+    METH_VARARGS | METH_KEYWORDS,
+    "builtin_disassemble (INFO, MEMORY_SOURCE = None) -> None\n\
+Disassemble using GDB's builtin disassembler.  INFO is an instance of\n\
+gdb.disassembler.DisassembleInfo.  The MEMORY_SOURCE, if not None, should\n\
+be an object with the read_memory method." },
+  {nullptr, nullptr, 0, nullptr}
+};
+
+#ifdef IS_PY3K
+/* Structure to define the _gdb.disassembler module.  */
+
+static struct PyModuleDef python_disassembler_module_def =
+{
+  PyModuleDef_HEAD_INIT,
+  "_gdb.disassembler",
+  nullptr,
+  -1,
+  python_disassembler_methods,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr
+};
+#endif
+
+/* Called to initialize the Python structures in this file.  */
+
+int
+gdbpy_initialize_disasm (void)
+{
+  /* Create the _gdb.disassembler module, and add it to the _gdb module.  */
+
+  PyObject *gdb_disassembler_module;
+#ifdef IS_PY3K
+  gdb_disassembler_module = PyModule_Create (&python_disassembler_module_def);
+#else
+  gdb_disassembler_module = Py_InitModule ("_gdb.disassembler",
+					   python_disassembler_methods);
+#endif
+  if (gdb_disassembler_module == nullptr)
+    return -1;
+  PyModule_AddObject (gdb_module, "disassembler", gdb_disassembler_module);
+
+  /* This is needed so that 'import _gdb.disassembler' will work.  */
+  PyObject *dict = PyImport_GetModuleDict ();
+  PyDict_SetItemString (dict, "_gdb.disassembler", gdb_disassembler_module);
+
+  /* Having the tp_new field as nullptr means that this class can't be
+     created from user code.  The only way they can be created is from
+     within GDB, and then they are passed into user code.  */
+  gdb_assert (disasm_info_object_type.tp_new == nullptr);
+  if (PyType_Ready (&disasm_info_object_type) < 0)
+    return -1;
+
+  return gdb_pymodule_addobject (gdb_disassembler_module, "DisassembleInfo",
+				 (PyObject *) &disasm_info_object_type);
+}
+
+/* Describe the gdb.disassembler.DisassembleInfo type.  */
+
+PyTypeObject disasm_info_object_type = {
+  PyVarObject_HEAD_INIT (nullptr, 0)
+  "gdb.disassembler.DisassembleInfo",		/*tp_name*/
+  sizeof (disasm_info_object),			/*tp_basicsize*/
+  0,						/*tp_itemsize*/
+  disasmpy_dealloc,				/*tp_dealloc*/
+  0,						/*tp_print*/
+  0,						/*tp_getattr*/
+  0,						/*tp_setattr*/
+  0,						/*tp_compare*/
+  0,						/*tp_repr*/
+  0,						/*tp_as_number*/
+  0,						/*tp_as_sequence*/
+  0,						/*tp_as_mapping*/
+  0,						/*tp_hash */
+  0,						/*tp_call*/
+  0,						/*tp_str*/
+  0,						/*tp_getattro*/
+  0,						/*tp_setattro*/
+  0,						/*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,				/*tp_flags*/
+  "GDB instruction disassembler object",	/* tp_doc */
+  0,						/* tp_traverse */
+  0,						/* tp_clear */
+  0,						/* tp_richcompare */
+  0,						/* tp_weaklistoffset */
+  0,						/* tp_iter */
+  0,						/* tp_iternext */
+  disasm_info_object_methods,			/* tp_methods */
+  0,						/* tp_members */
+  disasm_info_object_getset			/* tp_getset */
+};
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 735328b49c4..d0330c81079 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -497,6 +497,8 @@  int gdbpy_initialize_auto_load (void)
   CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
 int gdbpy_initialize_values (void)
   CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
+int gdbpy_initialize_disasm (void)
+  CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
 int gdbpy_initialize_frames (void)
   CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
 int gdbpy_initialize_instruction (void)
@@ -798,4 +800,23 @@  typedef std::unique_ptr<Py_buffer, Py_buffer_deleter> Py_buffer_up;
 extern bool gdbpy_parse_register_id (struct gdbarch *gdbarch,
 				     PyObject *pyo_reg_id, int *reg_num);
 
+/* Implement the 'print_insn' hook for Python.  Disassemble an instruction
+   whose address is ADDRESS for architecture GDBARCH.  The bytes of the
+   instruction should be read with INFO->read_memory_func as the
+   instruction being disassembled might actually be in a buffer.
+
+   Use INFO->fprintf_func to print the results of the disassembly, and
+   return the length of the instruction in octets.
+
+   If no instruction can be disassembled then return an empty value.  */
+
+extern gdb::optional<int> gdbpy_print_insn (struct gdbarch *gdbarch,
+					    CORE_ADDR address,
+					    disassemble_info *info);
+
+/* Return true if OBJ is a gdb.Architecture object, otherwise, return
+   false.  */
+
+bool gdbpy_is_arch_object (PyObject *obj);
+
 #endif /* PYTHON_PYTHON_INTERNAL_H */
diff --git a/gdb/python/python.c b/gdb/python/python.c
index d817bd5bf27..3aba565cd11 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -190,7 +190,7 @@  const struct extension_language_ops python_extension_ops =
 
   gdbpy_colorize,
 
-  NULL, /* gdbpy_print_insn, */
+  gdbpy_print_insn,
 };
 
 /* Architecture and language to be used in callbacks from
@@ -1852,6 +1852,7 @@  do_start_initialization ()
 
   if (gdbpy_initialize_auto_load () < 0
       || gdbpy_initialize_values () < 0
+      || gdbpy_initialize_disasm () < 0
       || gdbpy_initialize_frames () < 0
       || gdbpy_initialize_commands () < 0
       || gdbpy_initialize_instruction () < 0
@@ -2130,6 +2131,14 @@  do_initialize (const struct extension_language_defn *extlang)
       return true;
     }
 
+  /* Import gdb.disassembler now.  The disassembler module provides some
+     parameters that we want to be available to users from the moment GDB
+     starts up.  */
+  PyObject *gdb_disassembler_module
+    = PyImport_ImportModule ("gdb.disassembler");
+  if (gdb_disassembler_module == nullptr)
+    gdbpy_print_stack ();
+
   return gdb_pymodule_addobject (m, "gdb", gdb_python_module) >= 0;
 }
 
diff --git a/gdb/testsuite/gdb.base/style.exp b/gdb/testsuite/gdb.base/style.exp
index 91d3059612d..7aa51cdfe00 100644
--- a/gdb/testsuite/gdb.base/style.exp
+++ b/gdb/testsuite/gdb.base/style.exp
@@ -182,12 +182,26 @@  proc run_style_tests { } {
 
 	gdb_test_no_output "set width 0"
 
-	set main [limited_style main function]
-	set func [limited_style some_called_function function]
-	# Somewhere should see the call to the function.
-	gdb_test "disassemble main" \
-	    [concat "Dump of assembler code for function $main:.*" \
-		 "[limited_style $hex address].*$func.*"]
+	# Disassembly highlighting is done by Python, so, if the
+	# required modules are not available we'll not get the full
+	# highlighting.
+	if { $::python_disassembly_highlighting } {
+	    # Check that the header line of the disassembly output is
+	    # styled correctly, the address at the start of the first
+	    # disassembly line is styled correctly, and that there is at
+	    # least one escape sequence in the disassembly output.
+	    set main [limited_style main function]
+	    gdb_test "disassemble main" \
+		[concat "Dump of assembler code for function $main:\\r\\n" \
+		     "\\s+[limited_style $hex address]\\s+<\\+$decimal>:\[^\\r\\n\]+\033\\\[${decimal}\[^\\r\\n\]+.*" ""]
+	} else {
+	    set main [limited_style main function]
+	    set func [limited_style some_called_function function]
+	    # Somewhere should see the call to the function.
+	    gdb_test "disassemble main" \
+		[concat "Dump of assembler code for function $main:.*" \
+		     "[limited_style $hex address].*$func.*"]
+	}
 
 	set ifield [limited_style int_field variable]
 	set sfield [limited_style string_field variable]
@@ -312,6 +326,25 @@  proc test_startup_version_string { } {
     gdb_test "" "${vers}.*" "version is styled at startup"
 }
 
+# Check to see if the Python highlighting of disassembler output is
+# expected or not, this highlighting requires Python support in GDB,
+# and the Python pygments module to be available.
+clean_restart ${binfile}
+if {![skip_python_tests]} {
+    gdb_test_multiple "python import pygments" "" {
+	-re "ModuleNotFoundError: No module named 'pygments'.*$gdb_prompt $" {
+	    set python_disassembly_highlighting false
+	}
+	-re "ImportError: No module named pygments.*$gdb_prompt $" {
+	    set python_disassembly_highlighting false
+	}
+	-re "^python import pygments\r\n$gdb_prompt $" {
+	    set python_disassembly_highlighting true
+	}
+    }
+} else {
+    set python_disassembly_highlighting false
+}
 
 # Run tests with all styles in their default state.
 with_test_prefix "all styles enabled" {
diff --git a/gdb/testsuite/gdb.python/py-disasm.c b/gdb/testsuite/gdb.python/py-disasm.c
new file mode 100644
index 00000000000..1d89a49c346
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-disasm.c
@@ -0,0 +1,25 @@ 
+/* This test program is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int
+main ()
+{
+  asm ("nop");
+  asm ("nop");	/* Break here.  */
+  asm ("nop");
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-disasm.exp b/gdb/testsuite/gdb.python/py-disasm.exp
new file mode 100644
index 00000000000..f8d6140036d
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-disasm.exp
@@ -0,0 +1,201 @@ 
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It validates the Python
+# disassembler API.
+
+load_lib gdb-python.exp
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} "debug"] } {
+    return -1
+}
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+if ![runto_main] then {
+    fail "can't run to main"
+    return 0
+}
+
+set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
+
+gdb_test "source ${pyfile}" "Python script imported" \
+         "import python scripts"
+
+gdb_breakpoint [gdb_get_line_number "Break here."]
+gdb_continue_to_breakpoint "Break here."
+
+set curr_pc [get_valueof "/x" "\$pc" "*unknown*"]
+
+gdb_test_no_output "python current_pc = ${curr_pc}"
+
+# The current pc will be something like 0x1234 with no leading zeros.
+# However, in the disassembler output addresses are padded with zeros.
+# This substitution changes 0x1234 to 0x0*1234, which can then be used
+# as a regexp in the disassembler output matching.
+set curr_pc_pattern [string replace ${curr_pc} 0 1 "0x0*"]
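The effect of the `string replace` above can be checked with an equivalent Python snippet. This is only an illustration of the substitution: `padded_address_pattern` is a hypothetical helper, not part of the testsuite.

```python
import re


def padded_address_pattern(curr_pc):
    """Turn '0x1234' into the regexp '0x0*1234'."""
    assert curr_pc.startswith("0x")
    # Replace the leading "0x" with "0x0*" so that any number of padding
    # zeros is accepted between the prefix and the significant digits.
    return "0x0*" + curr_pc[2:]


pattern = padded_address_pattern("0x1234")
print(pattern)
print(bool(re.fullmatch(pattern, "0x0000000000001234")))
```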
+
+# Grab the name of the current architecture, this is used in the test
+# patterns below.
+set curr_arch [get_python_valueof "gdb.selected_inferior().architecture().name()" "*unknown*"]
+
+# Helper proc that removes all registered disassemblers.
+proc py_remove_all_disassemblers {} {
+    gdb_test_no_output "python remove_all_python_disassemblers()"
+}
+
+# A list of test plans.  Each plan is a list of two elements, the
+# first element is the name of a class in py-disasm.py, this is a
+# disassembler class.  The second element is a pattern that should be
+# matched in the disassembler output.
+#
+# Each different disassembler tests some different feature of the
+# Python disassembler API.
+set addr_pattern "\r\n=> ${curr_pc_pattern} <\[^>\]+>:\\s+"
+set base_pattern "${addr_pattern}nop"
+set test_plans \
+    [list \
+	 [list "" "${base_pattern}\r\n.*"] \
+	 [list "GlobalNullDisassembler" "${base_pattern}\r\n.*"] \
+	 [list "GlobalPreInfoDisassembler" "${base_pattern}\\s+## ad = $hex, st = None, le = None, ar = ${curr_arch}\r\n.*"] \
+	 [list "GlobalPostInfoDisassembler" "${base_pattern}\\s+## ad = $hex, st = nop, le = $decimal, ar = ${curr_arch}\r\n.*"] \
+	 [list "GlobalEscDisassembler" "${base_pattern}\\s+## style = False\r\n.*"] \
+	 [list "GlobalReadDisassembler" "${base_pattern}\\s+## bytes =( $hex)+\r\n.*"] \
+	 [list "GlobalAddrDisassembler" "${base_pattern}\\s+## addr = ${curr_pc_pattern} <\[^>\]+>\r\n.*"] \
+	 [list "SimpleMemoryErrorDisassembler" "${addr_pattern}Cannot access memory at address ${curr_pc_pattern}"] \
+	 [list "NonMemoryErrorEarlyDisassembler" "${addr_pattern}Python Exception <class 'gdb\\.GdbError'>: error before setting a result\r\nnop\r\n.*"] \
+	 [list "NonMemoryErrorLateDisassembler" "${addr_pattern}Python Exception <class 'gdb\\.GdbError'>: error after setting a result\r\nnop\r\n.*"] \
+	 [list "MemoryErrorEarlyDisassembler" "${base_pattern}\r\n.*"] \
+	 [list "MemoryErrorLateDisassembler" "${base_pattern}\r\n.*"] \
+	 [list "CaughtMemoryErrorEarlyDisassembler" "${addr_pattern}Cannot access memory at address 0x2"] \
+	 [list "CaughtMemoryErrorLateDisassembler" "${addr_pattern}Cannot access memory at address 0x2"] \
+	 [list "CaughtMemoryErrorEarlyAndReplaceDisassembler" "${base_pattern}\\s+## tag = GOT MEMORY ERROR\r\n.*"] \
+	 [list "SetResultBeforeBuiltinDisassembler" "${base_pattern}\r\n.*"]]
+
+# Now execute each test plan.
+foreach plan $test_plans {
+    set global_disassembler_name [lindex $plan 0]
+    set expected_pattern [lindex $plan 1]
+
+    with_test_prefix "global_disassembler=${global_disassembler_name}" {
+	# Remove all existing disassemblers.
+	py_remove_all_disassemblers
+
+	# If we have a disassembler to load, do it now.
+	if { $global_disassembler_name != "" } {
+	    gdb_test_no_output "python add_global_disassembler($global_disassembler_name)"
+	}
+
+	# Disassemble main, and check the disassembler output.
+	gdb_test "disassemble main" $expected_pattern
+    }
+}
+
+# Check that the architecture specific disassemblers can override the
+# global disassembler.
+#
+# First, register a global disassembler, and check it is in place.
+with_test_prefix "GLOBAL tagging disassembler" {
+    py_remove_all_disassemblers
+    gdb_test_no_output "python gdb.disassembler.register_disassembler(TaggingDisassembler(\"GLOBAL\"), None)"
+    gdb_test "disassemble main" "${base_pattern}\\s+## tag = GLOBAL\r\n.*"
+}
+
+# Now register an architecture specific disassembler, and check it
+# overrides the global disassembler.
+with_test_prefix "LOCAL tagging disassembler" {
+    gdb_test_no_output "python gdb.disassembler.register_disassembler(TaggingDisassembler(\"LOCAL\"), \"${curr_arch}\")"
+    gdb_test "disassemble main" "${base_pattern}\\s+## tag = LOCAL\r\n.*"
+}
+
+# Now remove the architecture specific disassembler, and check that
+# the global disassembler kicks back in.
+with_test_prefix "GLOBAL tagging disassembler again" {
+    gdb_test_no_output "python gdb.disassembler.register_disassembler(None, \"${curr_arch}\")"
+    gdb_test "disassemble main" "${base_pattern}\\s+## tag = GLOBAL\r\n.*"
+}
+
+# Check that a DisassembleInfo becomes invalid after the call into the
+# disassembler.
+with_test_prefix "DisassembleInfo becomes invalid" {
+    py_remove_all_disassemblers
+    gdb_test_no_output "python add_global_disassembler(GlobalCachingDisassembler)"
+    gdb_test "disassemble main" "${base_pattern}\\s+## CACHED\r\n.*"
+    gdb_test "python GlobalCachingDisassembler.check()" "PASS"
+}
+
+# Test the memory source aspect of the builtin disassembler.
+with_test_prefix "memory source api" {
+    py_remove_all_disassemblers
+    gdb_test_no_output "python gdb.disassembler.register_disassembler(analyzing_disassembler)"
+    gdb_test "disassemble main" "${base_pattern}\r\n.*"
+    gdb_test "python analyzing_disassembler.find_replacement_candidate()" \
+	"Replace from $hex to $hex with NOP"
+    gdb_test "disassemble main" "${base_pattern}\r\n.*" \
+	"second disassembler pass"
+    gdb_test "python analyzing_disassembler.check()" \
+	"PASS"
+}
+
+# The syntax highlighting disassembler makes use of the pygments
+# module.  Try importing the module now, if this fails then we can
+# skip the tests that check the syntax highlighting.
+gdb_test_multiple "python import pygments" "" {
+    -re "ModuleNotFoundError: No module named 'pygments'.*$gdb_prompt $" {
+	set pygments_module_available false
+    }
+    -re "ImportError: No module named pygments.*$gdb_prompt $" {
+	set pygments_module_available false
+    }
+    -re "^python import pygments\r\n$gdb_prompt $" {
+	set pygments_module_available true
+    }
+}
+
+if { $pygments_module_available } {
+    # Test the syntax highlighting disassembler.
+    with_test_prefix "syntax highlighting" {
+	py_remove_all_disassemblers
+	save_vars { env(TERM) } {
+	    # We need an ANSI-capable terminal to get the output.
+	    setenv TERM ansi
+
+	    clean_restart ${binfile}
+
+	    if ![runto_main] then {
+		fail "can't run to main"
+		return 0
+	    }
+
+	    gdb_test "source ${pyfile}" "Python script imported" \
+		"import python scripts"
+
+	    gdb_breakpoint [gdb_get_line_number "Break here."]
+	    gdb_continue_to_breakpoint "Break here."
+
+	    gdb_test_no_output "python current_pc = ${curr_pc}"
+
+	    gdb_test_no_output "python add_global_disassembler(GlobalColorDisassembler)"
+	    set styled_nop "\033\\\[\[0-9\]+(;\[0-9\]+)?mnop\033\\\[\[^m\]+m"
+	    set styled_address [style "${curr_pc_pattern}" address]
+	    gdb_test "disassemble main" "\r\n=> ${styled_address} <\[^>\]+>:\\s+${styled_nop}\r\n.*"
+	}
+    }
+} else {
+    untested "disassemble with styling"
+}
diff --git a/gdb/testsuite/gdb.python/py-disasm.py b/gdb/testsuite/gdb.python/py-disasm.py
new file mode 100644
index 00000000000..2cfcb7ceaff
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-disasm.py
@@ -0,0 +1,538 @@ 
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+import gdb
+import gdb.disassembler
+import struct
+import sys
+
+from gdb.disassembler import Disassembler
+
+# A global, holds the program-counter address at which we should
+# perform the extra disassembly that this script provides.
+current_pc = None
+
+
+def remove_all_python_disassemblers():
+    for a in gdb.architecture_names():
+        gdb.disassembler.register_disassembler(None, a)
+    gdb.disassembler.register_disassembler(None, None)
+
+
+class TestDisassembler(Disassembler):
+    """A base class for disassemblers within this script to inherit from.
+       Implements the __call__ method and ensures we only do any
+       disassembly wrapping for the global CURRENT_PC."""
+
+    def __init__(self):
+        global current_pc
+
+        super(TestDisassembler, self).__init__("TestDisassembler")
+        if current_pc is None:
+            raise gdb.GdbError("no current_pc set")
+
+    def __call__(self, info):
+        global current_pc
+
+        if info.address != current_pc:
+            return None
+        return self.disassemble(info)
+
+    def disassemble(self, info):
+        raise NotImplementedError("override the disassemble method")
+
+
+class GlobalPreInfoDisassembler(TestDisassembler):
+    """Check the attributes of DisassembleInfo before disassembly has occurred."""
+
+    def disassemble(self, info):
+        ad = info.address
+        st = info.string
+        le = info.length
+        ar = info.architecture
+
+        if le is not None:
+            raise gdb.GdbError("invalid length")
+
+        if st is not None:
+            raise gdb.GdbError("invalid string")
+
+        if ad != current_pc:
+            raise gdb.GdbError("invalid address")
+
+        gdb.disassembler.builtin_disassemble(info)
+
+        text = info.string + "\t## ad = 0x%x, st = %s, le = %s, ar = %s" % (
+            ad,
+            st,
+            le,
+            ar.name(),
+        )
+        info.set_result(info.length, text)
+
+
+class GlobalPostInfoDisassembler(TestDisassembler):
+    """Check the attributes of DisassembleInfo after disassembly has occurred."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+
+        ad = info.address
+        st = info.string
+        le = info.length
+        ar = info.architecture
+
+        if ad != current_pc:
+            raise gdb.GdbError("invalid address")
+
+        if st is None or st == "":
+            raise gdb.GdbError("invalid string")
+
+        if le <= 0:
+            raise gdb.GdbError("invalid length")
+
+        text = info.string + "\t## ad = 0x%x, st = %s, le = %d, ar = %s" % (
+            ad,
+            st,
+            le,
+            ar.name(),
+        )
+        info.set_result(info.length, text)
+
+
+class GlobalEscDisassembler(TestDisassembler):
+    """Check the can_emit_style_escape attribute."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        text = info.string + "\t## style = %s" % info.can_emit_style_escape
+        info.set_result(info.length, text)
+
+
+class GlobalReadDisassembler(TestDisassembler):
+    """Check the DisassembleInfo.read_memory method."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        length = info.length
+        byte_str = ""
+        for o in range(length):
+            if byte_str != "":
+                byte_str += " "
+            v = bytes(info.read_memory(1, o))[0]
+            if sys.version_info[0] < 3:
+                v = struct.unpack('<B', v)
+            byte_str += "0x%02x" % v
+        text = info.string + "\t## bytes = %s" % byte_str
+        info.set_result(info.length, text)
+
+
+class GlobalAddrDisassembler(TestDisassembler):
+    """Check the gdb.disassembler.format_address method."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        arch = info.architecture
+        addr = info.address
+        addr_str = gdb.disassembler.format_address(arch, addr)
+        text = info.string + "\t## addr = %s" % addr_str
+        info.set_result(info.length, text)
+
+
+class NonMemoryErrorEarlyDisassembler(TestDisassembler):
+    """Throw an error (not a memory error) before setting a result."""
+
+    def disassemble(self, info):
+        raise gdb.GdbError("error before setting a result")
+        gdb.disassembler.builtin_disassemble(info)
+
+
+class NonMemoryErrorLateDisassembler(TestDisassembler):
+    """Throw an error (not a memory error) after setting a result."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        raise gdb.GdbError("error after setting a result")
+
+
+class MemoryErrorEarlyDisassembler(TestDisassembler):
+    """Throw a memory error before setting a result."""
+
+    def disassemble(self, info):
+        info.read_memory(1, -info.address + 2)
+        gdb.disassembler.builtin_disassemble(info)
+
+
+class MemoryErrorLateDisassembler(TestDisassembler):
+    """Throw a memory error after setting a result."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        info.read_memory(1, -info.address + 2)
+
+
+class SimpleMemoryErrorDisassembler(TestDisassembler):
+    """Some basic testing around setting memory errors, ensure that the
+    length and string return to None after setting a memory error."""
+
+    def disassemble(self, info):
+        if info.length is not None:
+            raise gdb.GdbError("length is not None before")
+        if info.string is not None:
+            raise gdb.GdbError("string is not None before")
+        info.set_result(1, "!! INVALID !! ")
+        info.memory_error(0)
+        if info.length is not None:
+            raise gdb.GdbError("length is not None after")
+        if info.string is not None:
+            raise gdb.GdbError("string is not None after")
+
+
+class CaughtMemoryErrorEarlyDisassembler(TestDisassembler):
+    """Catch a memory error before setting a result, then report it."""
+
+    def disassemble(self, info):
+        try:
+            info.read_memory(1, -info.address + 2)
+        except gdb.MemoryError as e:
+            info.memory_error(-info.address + 2)
+            return None
+        gdb.disassembler.builtin_disassemble(info)
+
+
+class CaughtMemoryErrorLateDisassembler(TestDisassembler):
+    """Catch a memory error after setting a result, then report it."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        try:
+            info.read_memory(1, -info.address + 2)
+        except gdb.MemoryError as e:
+            # This memory error will discard the earlier result and
+            # mark this disassembly as failed with a memory error.
+            info.memory_error(-info.address + 2)
+
+
+class SetResultBeforeBuiltinDisassembler(TestDisassembler):
+    """Set a result, then call the builtin disassembler."""
+
+    def disassemble(self, info):
+        info.set_result(1, "!! DISCARD THIS TEXT !! ")
+        gdb.disassembler.builtin_disassemble(info)
+
+
+class CaughtMemoryErrorEarlyAndReplaceDisassembler(TestDisassembler):
+    """Catch and report a memory error, then replace it with a result."""
+
+    def disassemble(self, info):
+        tag = "NO MEMORY ERROR"
+        try:
+            info.read_memory(1, -info.address + 2)
+        except gdb.MemoryError as e:
+            info.memory_error(-info.address + 2)
+            tag = "GOT MEMORY ERROR"
+
+        # This disassembly will replace the earlier memory error
+        # marker, and leave this instruction disassembling just fine,
+        # however, the tag that we add will tell us that we did see a
+        # memory error.
+        gdb.disassembler.builtin_disassemble(info)
+        text = info.string + "\t## tag = %s" % tag
+        info.set_result(info.length, text)
+
+
+class TaggingDisassembler(TestDisassembler):
+    """A simple disassembler that just tags the output."""
+
+    def __init__(self, tag):
+        super(TaggingDisassembler, self).__init__()
+        self._tag = tag
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        text = info.string + "\t## tag = %s" % self._tag
+        info.set_result(info.length, text)
+
+
+class GlobalColorDisassembler(TestDisassembler):
+    """A disassembler that performs syntax highlighting."""
+
+    def disassemble(self, info):
+        gdb.disassembler.builtin_disassemble(info)
+        gdb.disassembler.syntax_highlight(info)
+
+
+class GlobalCachingDisassembler(TestDisassembler):
+    """A disassembler that caches the DisassembleInfo that is passed in. Once
+    the call into the disassembler is complete then the DisassembleInfo
+    becomes invalid, and any calls into it should trigger an
+    exception."""
+
+    # This is where we cache the DisassembleInfo object.
+    cached_insn_disas = None
+
+    def disassemble(self, info):
+        """Disassemble the instruction, add a CACHED comment to the output,
+        and cache the DisassembleInfo so that it is not garbage collected."""
+        GlobalCachingDisassembler.cached_insn_disas = info
+        gdb.disassembler.builtin_disassemble(info)
+        text = info.string + "\t## CACHED"
+        info.set_result(info.length, text)
+
+    @staticmethod
+    def check():
+        """Check that all of the methods on the cached DisassembleInfo trigger an
+        exception."""
+        info = GlobalCachingDisassembler.cached_insn_disas
+        assert isinstance(info, gdb.disassembler.DisassembleInfo)
+        try:
+            val = info.address
+            raise gdb.GdbError("DisassembleInfo.address is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError("DisassembleInfo.address raised an unexpected exception")
+
+        try:
+            val = info.string
+            raise gdb.GdbError("DisassembleInfo.string is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError("DisassembleInfo.string raised an unexpected exception")
+
+        try:
+            val = info.length
+            raise gdb.GdbError("DisassembleInfo.length is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError("DisassembleInfo.length raised an unexpected exception")
+
+        try:
+            val = info.architecture
+            raise gdb.GdbError("DisassembleInfo.architecture is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError(
+                "DisassembleInfo.architecture raised an unexpected exception"
+            )
+
+        try:
+            val = info.read_memory(1, 0)
+            raise gdb.GdbError("DisassembleInfo.read_memory is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError("DisassembleInfo.read_memory raised an unexpected exception")
+
+        try:
+            val = info.set_result(1, "XXX")
+            raise gdb.GdbError("DisassembleInfo.set_result is still valid")
+        except RuntimeError as e:
+            assert str(e) == "DisassembleInfo is no longer valid."
+        except:
+            raise gdb.GdbError(
+                "DisassembleInfo.set_result raised an unexpected exception"
+            )
+
+        print("PASS")
+
+
+class GlobalNullDisassembler(TestDisassembler):
+    """A disassembler that does not change the output at all."""
+
+    def disassemble(self, info):
+        pass
+
+
+class AnalyzingDisassembler(Disassembler):
+    def __init__(self, name):
+        """Constructor."""
+        super(AnalyzingDisassembler, self).__init__(name)
+
+        # Details about the instructions found during the first disassembler
+        # pass.
+        self._pass_1_length = []
+        self._pass_1_insn = []
+        self._pass_1_address = []
+
+        # The start and end address for the instruction we will replace with
+        # one or more 'nop' instructions during pass two.
+        self._start = None
+        self._end = None
+
+        # The index in the _pass_1_* lists for where the nop instruction can
+        # be found, also, the buffer of bytes that make up a nop instruction.
+        self._nop_index = None
+        self._nop_bytes = None
+
+        # The DisassembleInfo object passed into __call__ as INFO.
+        self._info = None
+
+        # A flag that indicates if we are in the first or second pass of
+        # this disassembler test.
+        self._first_pass = True
+
+        # The disassembled instructions collected during the second pass.
+        self._pass_2_insn = []
+
+        # A copy of _pass_1_insn that has been modified to include the extra
+        # 'nop' instructions we plan to insert during the second pass.  This
+        # is then checked against _pass_2_insn after the second disassembler
+        # pass has completed.
+        self._check = []
+
+    def __call__(self, info):
+        """Called to perform the disassembly."""
+
+        # Record INFO; we'll need to refer to this in READ_MEMORY, which
+        # the builtin disassembler calls back into.
+        self._info = info
+        gdb.disassembler.builtin_disassemble(info, self)
+
+        # Record some information about the first 'nop' instruction we find.
+        if self._nop_index is None and info.string == "nop":
+            self._nop_index = len(self._pass_1_length)
+            # The offset in the following read_memory call defaults to 0.
+            self._nop_bytes = info.read_memory(info.length)
+
+        # Record information about each instruction that is disassembled.
+        # This test is performed in two passes, and we need different
+        # information in each pass.
+        if self._first_pass:
+            self._pass_1_length.append(info.length)
+            self._pass_1_insn.append(info.string)
+            self._pass_1_address.append(info.address)
+        else:
+            self._pass_2_insn.append(info.string)
+
+    def _read_replacement(self, length, offset):
+        """Return a slice of the buffer representing the replacement nop
+        instructions."""
+
+        assert self._nop_bytes is not None
+        rb = self._nop_bytes
+
+        # If this request is outside of a nop instruction then we don't know
+        # what to do, so just raise a memory error.
+        if offset >= len(rb) or (offset + length) > len(rb):
+            raise gdb.MemoryError("invalid length and offset combination")
+
+        # Return only the slice of the nop instruction as requested.
+        s = offset
+        e = offset + length
+        return rb[s:e]
+
+    def read_memory(self, len, offset):
+        """Callback used from the builtin disassembler to read the contents of
+        memory."""
+
+        info = self._info
+        assert info is not None
+
+        # If this request is within the region we are replacing with 'nop'
+        # instructions, then call the helper function to perform that
+        # replacement.
+        if self._start is not None:
+            assert self._end is not None
+            if info.address >= self._start and info.address < self._end:
+                return self._read_replacement(len, offset)
+
+        # Otherwise, we just forward this request to the default read memory
+        # implementation.
+        return info.read_memory(len, offset)
+
+    def find_replacement_candidate(self):
+        """Call this after the first disassembly pass.  This identifies a suitable
+        instruction to replace with 'nop' instruction(s)."""
+
+        if self._nop_index is None:
+            raise gdb.GdbError("no nop was found")
+
+        nop_idx = self._nop_index
+        nop_length = self._pass_1_length[nop_idx]
+
+        # First we look for an instruction that is larger than a nop
+        # instruction, but whose length is an exact multiple of the nop
+        # instruction's length.
+        replace_idx = None
+        for idx in range(len(self._pass_1_length)):
+            if (
+                idx > 0
+                and idx != nop_idx
+                and self._pass_1_insn[idx] != "nop"
+                and self._pass_1_length[idx] > self._pass_1_length[nop_idx]
+                and self._pass_1_length[idx] % self._pass_1_length[nop_idx] == 0
+            ):
+                replace_idx = idx
+                break
+
+        # If we still don't have a replacement candidate, then search again,
+        # this time looking for an instruction that is the same length as a
+        # nop instruction.
+        if replace_idx is None:
+            for idx in range(len(self._pass_1_length)):
+                if (
+                    idx > 0
+                    and idx != nop_idx
+                    and self._pass_1_insn[idx] != "nop"
+                    and self._pass_1_length[idx] == self._pass_1_length[nop_idx]
+                ):
+                    replace_idx = idx
+                    break
+
+        # Weird, the nop instruction must be larger than every other
+        # instruction, or all instructions are 'nop'?
+        if replace_idx is None:
+            raise gdb.GdbError("can't find an instruction to replace")
+
+        # Record the instruction range that will be replaced with 'nop'
+        # instructions, and mark that we are now on the second pass.
+        self._start = self._pass_1_address[replace_idx]
+        self._end = self._pass_1_address[replace_idx] + self._pass_1_length[replace_idx]
+        self._first_pass = False
+        print("Replace from 0x%x to 0x%x with NOP" % (self._start, self._end))
+
+        # Finally, build the expected result.  Create the _check list, which
+        # is a copy of _pass_1_insn, but replace the instruction we
+        # identified above with a series of 'nop' instructions.
+        self._check = list(self._pass_1_insn)
+        nop_count = self._pass_1_length[replace_idx] // self._pass_1_length[nop_idx]
+        nops = ["nop"] * nop_count
+        self._check[replace_idx : (replace_idx + 1)] = nops
+
+    def check(self):
+        """Call this after the second disassembler pass to validate the output."""
+        if self._check != self._pass_2_insn:
+            raise gdb.GdbError("mismatch")
+        print("PASS")
+
+# Create a global instance of the AnalyzingDisassembler.  This isn't
+# registered as a disassembler yet though, that is done from the
+# py-disasm.exp later.
+analyzing_disassembler = AnalyzingDisassembler("AnalyzingDisassembler")
+
+def add_global_disassembler(dis_class):
+    """Create an instance of DIS_CLASS and register it as a global disassembler."""
+    dis = dis_class()
+    gdb.disassembler.register_disassembler(dis, None)
+
+
+# Start with all disassemblers removed.
+remove_all_python_disassemblers()
+
+print("Python script imported")