LearningCOM

2011 is the year that will go down in my personal history as the year when I was confronted with Microsoft COM for the first time, in the form of a C++/C#/VBA codebase that had "matured" over 20 years and where COM had been liberally used for all sorts of interaction between various parts of an application. My subsequent struggle to understand what COM is all about has resulted in a large amount of condensed notes, which I try to present on this page in a semi-organized way.

A couple of things have (sometimes greatly) frustrated me in my efforts to understand COM, and I feel compelled to mention them here:

  • A lot of half-knowledge is floating around out there that makes it difficult to grasp some concepts.
  • Sometimes different sources use different terminology to mean the same thing, which can make it very difficult to connect the knowledge imparted by two articles. An example is the meaning of early/late binding (did you ever hear of very early binding?). The sad thing is that this even extends to MSDN, especially when you read articles that were written years apart by different authors.
  • Microsoft's confusing use of marketing names such as ActiveX and OLE.

Despite all difficulties, I believe I have succeeded fairly well in accumulating a working knowledge of COM. Yet there is still a lot left to be explored, so this page is bound to get some updates in the future.


Glossary

COM
Component Object Model. COM is the base for the following technologies (some of which are mainly marketing names invented by Microsoft): OLE, OLE Automation, ActiveX, COM+, DCOM.
IDL
Interface Description (or Definition) Language. This is a general term and has nothing to do with COM per se. An IDL makes it possible to describe interfaces in a way that is independent of any programming language. Examples of systems that use an IDL are CORBA and COM.
MIDL
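Microsoft Interface Definition Language. This is Microsoft's extension of the OSF DCE IDL, the interface definition language of DCE RPC (not to be confused with OMG IDL, which is the IDL used by CORBA). The extension adds specifics for COM and DCOM. MIDL files are compiled using the MIDL compiler.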
ODL
Object Description Language. An IDL defined by Microsoft to describe interfaces for OLE Automation. Today ODL has been superseded by MIDL. Originally, ODL files were compiled using a tool called mktyplib, but that tool has become obsolete and nowadays the MIDL compiler has taken over its job. This MSDN article explains the difference between mktyplib and the MIDL compiler.
CLSID
Class ID. A UUID identifying a COM class.
IID
Interface ID. A UUID identifying a COM interface.
LIBID
Library ID. A UUID identifying a COM type library.
CoClass
COM class. A COM class is a concrete implementation of one or more COM interfaces.
Late binding
No strong type checking at compile time, method call is set up at runtime.
Interop Marshaling
The process of marshaling data across the boundary between an unmanaged and a managed environment.
Managed environment
The .NET environment. Managed code is written in any of the various .NET languages and is executed under the management of the .NET CLR virtual machine.
Unmanaged environment
Unmanaged code is written in a non-.NET language such as C++ or Delphi and is executed natively.
RPC
Remote Procedure Call. A general concept that enables a client to call functions of a component/service that is located on a remote system. DCOM uses MSRPC as its RPC mechanism.
ATL registrar script
An .rgs file, intended to be embedded as a resource into a native .dll or .exe. At runtime the script is consumed by the ATL registrar service, which populates the Windows Registry with keys/values according to the directives in the registrar script. Registrar scripts are typically used to implement self-registration of native COM servers. See MSDN [1] for more details about .rgs files.
DISPID
Abbreviation for Dispatch ID.
Dispatch ID
The numeric ID of a function in an interface that extends IDispatch.
Dispinterface
A COM interface whose functions can only be called via the "Automation" call mechanism, not via the "vtable" call mechanism. "Dispinterface" is probably short for "IDispatch interface", because the "Automation" call mechanism is implemented via IDispatch.
Dual interface
A COM interface whose functions can be called both via the "Automation" and the "vtable" call mechanism.
Automation interface
A COM interface whose functions can be called via the "Automation" call mechanism. It is not entirely clear whether a dual interface is also an Automation interface, or whether Automation interface refers only to dispinterfaces.


Main references

  • The MSDN entry point for COM: [2]
  • MSDN guide that introduces COM: [3]
  • KB article that explains the different forms of binding: [4]


Overview and basic terminology

Basics

  • COM is the foundation technology for OLE, ActiveX and other technologies
  • COM allows software components and their consumers to be written in different programming languages. DCOM allows components and their consumers to be located on different computer systems.


Basic terms

  • COM is based on interfaces that contain functions
  • A COM interface is implemented by a COM class, also known as CoClass
  • A COM class may implement more than one COM interface
  • Every COM interface and COM class is identified by a UUID, also known as IID (short for interface ID) and CLSID (short for class ID), which is defined at design time
  • In addition to a CLSID, a COM class may also be identified by a so-called "ProgID", a human-friendly identifier that can be used to identify the COM class instead of the technical CLSID. A ProgID is not a replacement for a CLSID, it is merely a synonym. A ProgID always resolves to exactly one CLSID (but a CLSID may be referred to by multiple ProgIDs). A ProgID must fulfill several requirements [5] (e.g. maximum length 39 characters) but - unlike the CLSID - is not guaranteed to be globally unique because someone else could come up with the same human-friendly ProgID.
  • A .dll or .exe file that contains one or more COM classes is called a "COM server"
  • An agent that instantiates a COM class is called a "COM client"
  • An instantiated COM class is called a "COM object" or "COM component". After instantiation, a COM object is referenced only by one of the interfaces that it implements.
  • The description of one or more COM interfaces can be stored in a so-called "type library", sometimes abbreviated to "TLB". A type library has two uses:
    1. It can be used at design time to provide type information to the developer of a COM client (e.g. an IDE may provide code completion). This may or may not result in the COM client using early binding.
    2. It may be required at runtime to provide type information when a COM client accesses a COM object via late binding
  • A COM client is said to use "late binding" to access a COM object if only minimal knowledge about the COM interface is compiled into the COM client at design time, and the specific knowledge required to execute a function call is looked up at runtime. For instance, at design time the client only knows that it is calling a function named "Foo"; the information that this is the 3rd function of the interface must be looked up at runtime.
  • A COM client is said to use "early binding" to access a COM object if additional knowledge about the COM interface it is using is compiled into the COM client at design time so that this knowledge is immediately available at runtime without costly lookup. For instance, at design time the knowledge that the function "Foo" is the 3rd function of the interface is already compiled into the client.


COM interfaces

  • All COM interfaces inherit from a base interface named IUnknown that has a well-known IID


Reference counting

  • COM works on the basis of reference counting
  • When a COM object is created its reference count is 1
  • During the lifecycle of the COM object, the reference count may fluctuate in the range >= 1
  • When a COM object's reference count reaches 0, the object is destroyed
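
The rules above can be illustrated with a small portable C++ sketch. It does not use any Windows headers: the class name RefCounted and the destroyedFlag parameter are made up for this illustration, and a real COM object would expose AddRef() and Release() through IUnknown and would normally update the counter in a thread-safe way.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object (not a real COM type).
class RefCounted {
public:
    // destroyedFlag exists only so that the sketch can observe destruction.
    explicit RefCounted(bool* destroyedFlag) : destroyed_(destroyedFlag) {}

    unsigned long AddRef() { return ++refCount_; }

    unsigned long Release() {
        assert(refCount_ > 0);
        const unsigned long remaining = --refCount_;
        if (remaining == 0) {
            delete this;  // the object destroys itself when the count reaches 0
        }
        return remaining;
    }

private:
    // Private destructor: only Release() is allowed to destroy the object.
    ~RefCounted() { *destroyed_ = true; }

    unsigned long refCount_ = 1;  // a freshly created object starts at 1
    bool* destroyed_;
};

// Walks through a typical lifecycle and reports whether the object stayed
// alive until the last reference was released.
inline bool lifecycleDemo() {
    bool destroyed = false;
    RefCounted* object = new RefCounted(&destroyed);  // count is 1
    object->AddRef();                                 // count is 2
    object->Release();                                // count is 1
    const bool aliveBeforeLastRelease = !destroyed;
    object->Release();                                // count is 0, destroyed
    return aliveBeforeLastRelease && destroyed;
}
```

The private destructor is a deliberate choice in this sketch: it prevents callers from bypassing the reference count with a plain delete.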


Role of the Windows Registry

In order for COM to work at runtime, the Windows Registry plays a crucial role:

  • The Windows Registry maintains information about all the COM types (interfaces, classes, etc.) provided by COM servers installed on the system
  • The Windows Registry also maintains information about any type libraries required for late binding COM clients to COM objects at runtime
  • The information in the Windows Registry must obey certain well-known rules so that the COM system can find it
  • There are many ways in which the necessary information can be placed in the Windows Registry. Two well-known tools that perform the registration process are regsvr32.exe (for native .dll files) and regasm.exe (for .NET assemblies).
  • A process called "registration-free COM" exists which obviates the need for registering COM artifacts. This needs further research, though (TODO).


Note: Type libraries may also be registered to provide type information to developers of a COM client at design time.


Sequence of events

The following is an overview of the sequence of events that take place when a COM object is created and a function is invoked:

  • The COM client attempts to instantiate a COM class. The client provides either the CLSID or the ProgID to identify the COM class.
  • The COM system consults the registry to resolve the ProgID into a CLSID (if a ProgID was used), and then to resolve the CLSID into the path name of the COM server (a .dll or .exe file) that contains the COM class
    • Additional things happen if the COM server is on a remote system. I have not researched this.
  • Windows either loads the COM server into the process space of the COM client application (in-process server, must be a .dll), or it starts a new process for the COM server (out-of-process server, usually an .exe)
  • The COM server creates an instance of the desired COM class and returns to the client a reference to one of the interfaces implemented by the COM class
  • The COM client calls a function of the interface. The function call can take one of several routes until it gets to its destination:
    • When both COM client and COM server live in the same process and are written in native C++, such a function call may be resolved in a very direct manner, similar to a native C++ function call
    • When both COM client and COM server live in the same process, but one is written in native C++ and the other in a managed .NET language, then the function call will have to cross the boundary between the managed/unmanaged environments. This process is called Interop Marshaling.
    • When COM client and COM server live in different processes, some sort of inter-process communication takes place. I don't know the nature of this communication (TODO).
    • When COM client and COM server live on different machines, i.e. in a truly distributed environment, the function call is made via Remote Procedure Calls (RPC)
  • Regardless of the route taken, any parameter values are marshaled from COM client to COM object
  • The COM object executes code and returns the result
  • The result value is marshaled back from the COM object to the COM client
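
The lookup step at the beginning of this sequence (ProgID to CLSID, then CLSID to server path) can be sketched with an in-memory stand-in for the registry. Everything here is hypothetical: RegistrySketch and resolveServerPath() are names invented for this page, and treating any identifier that starts with "{" as a CLSID is a simplification of what the COM library really does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the relevant registry entries.
struct RegistrySketch {
    std::map<std::string, std::string> progIdToClsid;      // ProgID -> CLSID
    std::map<std::string, std::string> clsidToServerPath;  // CLSID -> .dll/.exe
};

// Resolves a ProgID or CLSID to the path of the COM server.
// Returns false if either lookup step fails.
inline bool resolveServerPath(const RegistrySketch& registry,
                              const std::string& classIdentifier,
                              std::string* serverPath) {
    assert(serverPath != nullptr);
    std::string clsid = classIdentifier;

    // Assumption of this sketch: a CLSID is written in braces, anything
    // else is treated as a ProgID and resolved to a CLSID first.
    if (classIdentifier.empty() || classIdentifier[0] != '{') {
        const auto progId = registry.progIdToClsid.find(classIdentifier);
        if (progId == registry.progIdToClsid.end()) {
            return false;  // unknown ProgID
        }
        clsid = progId->second;
    }

    const auto server = registry.clsidToServerPath.find(clsid);
    if (server == registry.clsidToServerPath.end()) {
        return false;  // class is not registered
    }
    *serverPath = server->second;
    return true;
}
```

A client that starts from a ProgID goes through both maps, while a client that starts from a CLSID skips the first step.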


COM function calls may also occur asynchronously. This is not covered here, though.


vtable/Automation calls and Early/Late binding

vtable/Automation calls

COM offers two mechanisms for calling the methods of an interface:

  • The "vtable" mechanism
  • The "Automation" mechanism


The "vtable" mechanism:

  • A COM interface that supports the "vtable" call mechanism has a so-called "vtable": Basically this is a list of the functions that make up the interface.
    • The term "vtable" is based on the C++ "vtable", which is used to call virtual methods in that language
  • The index position of each function in the vtable is important as it uniquely identifies the function.
  • The order in which functions are declared in an IDL file defines the position of each function in the interface's vtable
  • When the compiler finds a function call that uses the "vtable" mechanism, it looks up the position of the function in the interface's vtable and stores the index position in the object file (the object file is the output that the compiler produces)
  • At runtime, the function is called using the index position that was compiled into the executable
  • Furthermore, the data types of the function's parameters and return value are also determined at compile time
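
To make the role of the index position more tangible, here is a minimal sketch that models a vtable as an array of function pointers. All names (Negate, Double, Square, kSquareSlot, callSquare) are invented for this illustration. A real COM vtable additionally starts with the three IUnknown functions and is laid out by the compiler, not written by hand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Slot = int (*)(int);
using Vtable = std::array<Slot, 3>;

inline int Negate(int x) { return -x; }
inline int Double(int x) { return 2 * x; }
inline int Square(int x) { return x * x; }

// Declaration order Negate, Double, Square gives index positions 0, 1, 2.
const Vtable kOriginalVtable = {{&Negate, &Double, &Square}};

// The same functions in a different order, as if the interface definition
// had been edited after the client was compiled.
const Vtable kReorderedVtable = {{&Square, &Negate, &Double}};

// What remains of a call to "Square" after compilation: the name is gone,
// only the index position is left.
const std::size_t kSquareSlot = 2;

inline int callSquare(const Vtable& vtable, int x) {
    assert(kSquareSlot < vtable.size());
    return vtable[kSquareSlot](x);
}
```

With kOriginalVtable, callSquare(kOriginalVtable, 5) runs Square. With kReorderedVtable the same compiled call silently runs Double instead, which is why the function order of a published vtable interface must not change.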


The "Automation" mechanism:

  • A COM interface that supports the "Automation" call mechanism is often simply called an "Automation interface"
  • A COM interface that supports the "Automation" call mechanism assigns a unique ID to each of the functions that make up the interface. This ID is called "Dispatch ID", or DISPID.
  • At runtime, the function is called using the dispatch ID. The dispatch ID is determined in one of two ways:
    • Either the caller already knows the dispatch ID (e.g. because the compiler knew about the dispatch ID at compile time)
    • Or the caller only knows the name of the function, in which case a lookup is made in a table that maps function names to dispatch IDs
  • Automation interfaces use dynamic typing, i.e. parameters and return value are encapsulated into a generic data type "Variant"


Considerations

  • The "vtable" calling mechanism is faster because no lookups have to be made at runtime to identify the function to be called. Also, there is no overhead from encapsulating parameters and return type into a generic data type "Variant".
  • The "Automation" calling mechanism is the only way for interpreted languages to make COM function calls, because program code written in an interpreted language only contains the name of the function to call.
  • Automation interfaces are more open to change because the coupling between caller and implementation is not as tight as with vtable interfaces. vtable interfaces are basically immutable, i.e. once defined they must never be changed - unless the calling executable is recompiled using the changed definition. It may be possible to add new methods to a vtable interface, but I have not researched this.


Dual interfaces

The designer of a COM interface can elect to support only the "vtable" or the "Automation" function call mechanism, or both. An interface that supports both mechanisms is called a dual interface.


COM interface types IUnknown and IDispatch

IUnknown

  • IUnknown is the root interface in all of COM - every other interface is derived from IUnknown!
  • IUnknown is used to implement vtable interfaces
  • IUnknown defines 3 functions
    • QueryInterface() - this function is used to obtain a pointer to an interface from the IID, i.e. the interface's UUID
    • AddRef() and Release() - these functions are used to reference count a COM object
  • Also see the Wikipedia entry [6] and the MSDN article [7] for IUnknown
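
The following is a simplified, portable model of how QueryInterface() lets a client navigate between the interfaces of one object. It is only a sketch under several assumptions: IUnknownSketch, IFoo, IBar and FooBar are hypothetical names, interface IDs are plain strings instead of GUIDs, and the functions return bool/unsigned long instead of HRESULT/ULONG.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for IUnknown (not the real declaration).
struct IUnknownSketch {
    virtual bool QueryInterface(const std::string& iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownSketch() = default;  // objects are destroyed via Release() only
};

struct IFoo : IUnknownSketch { virtual int Foo() = 0; };
struct IBar : IUnknownSketch { virtual int Bar() = 0; };

// A class that implements two interfaces.
class FooBar final : public IFoo, public IBar {
public:
    bool QueryInterface(const std::string& iid, void** out) override {
        assert(out != nullptr);
        if (iid == "IFoo" || iid == "IUnknown") {
            *out = static_cast<IFoo*>(this);
        } else if (iid == "IBar") {
            *out = static_cast<IBar*>(this);
        } else {
            *out = nullptr;
            return false;  // interface not supported
        }
        AddRef();  // every interface pointer handed out is a new reference
        return true;
    }

    unsigned long AddRef() override { return ++refCount_; }

    unsigned long Release() override {
        assert(refCount_ > 0);
        const unsigned long remaining = --refCount_;
        if (remaining == 0) {
            delete this;
        }
        return remaining;
    }

    int Foo() override { return 1; }
    int Bar() override { return 2; }

private:
    ~FooBar() = default;
    unsigned long refCount_ = 1;
};
```

A client holding an IFoo pointer can ask for "IBar", use the returned pointer, and must then release both pointers separately, because a successful QueryInterface() counts as an additional reference.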


IDispatch

  • IDispatch is used to implement Automation interfaces
  • IDispatch extends IUnknown by 4 additional functions
    • GetIDsOfNames() - this function is used at runtime to map a method name to its DISPID
    • Invoke() - this function calls the function that matches a DISPID
    • GetTypeInfoCount() and GetTypeInfo() - these functions provide information about the interface, e.g. to determine DISPIDs for DISPID binding (see next section for DISPID binding)
  • Also see the Wikipedia entry [8] and the MSDN article [9] for IDispatch


Early binding, late binding, DISPID binding

Late binding

  • A client that uses IDispatch to call functions uses late binding
  • At runtime the DISPID of the function to be called is determined by IDispatch::GetIDsOfNames()
  • The function represented by the DISPID is then called using IDispatch::Invoke()


DISPID binding

  • This term is defined by [4]
  • DISPID binding is a hybrid form of late binding and occurs when the DISPID of a function is already known at compile time
  • If a compiler has information about a COM interface, it can insert the DISPID of a function being called into the compiler output
  • Thus, at runtime the call to IDispatch::GetIDsOfNames() is not necessary
  • Information about a COM interface is typically available at compile time through a type library
  • The Wrox Press book "COM IDL & Interface Design" [10] confusingly calls DISPID binding "early binding"
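
The difference between late binding and DISPID binding can be sketched in portable C++. This is a toy model, not the real IDispatch API: Variant, DispatchSketch, GetIdOfName(), callLateBound() and callDispIdBound() are hypothetical names, and the real GetIDsOfNames() and Invoke() take considerably more parameters.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for COM's VARIANT: a type tag plus a value.
struct Variant {
    enum class Type { Empty, Int, Text };
    Type type = Type::Empty;
    int intValue = 0;
    std::string textValue;

    static Variant FromInt(int value) {
        Variant v; v.type = Type::Int; v.intValue = value; return v;
    }
    static Variant FromText(const std::string& value) {
        Variant v; v.type = Type::Text; v.textValue = value; return v;
    }
};

using DispId = long;
const DispId kDispIdAdd = 1;    // in real life taken from a type library
const DispId kDispIdGreet = 2;

class DispatchSketch {
public:
    // Plays the role of IDispatch::GetIDsOfNames(): name -> DISPID.
    bool GetIdOfName(const std::string& name, DispId* id) const {
        assert(id != nullptr);
        ++nameLookups_;
        const auto entry = ids_.find(name);
        if (entry == ids_.end()) return false;
        *id = entry->second;
        return true;
    }

    // Plays the role of IDispatch::Invoke(): argument types are checked
    // at runtime, not at compile time.
    bool Invoke(DispId id, const std::vector<Variant>& args, Variant* result) const {
        assert(result != nullptr);
        if (id == kDispIdAdd && args.size() == 2 &&
            args[0].type == Variant::Type::Int && args[1].type == Variant::Type::Int) {
            *result = Variant::FromInt(args[0].intValue + args[1].intValue);
            return true;
        }
        if (id == kDispIdGreet && args.size() == 1 &&
            args[0].type == Variant::Type::Text) {
            *result = Variant::FromText("Hello, " + args[0].textValue);
            return true;
        }
        return false;  // unknown DISPID or wrong argument types
    }

    int nameLookups() const { return nameLookups_; }

private:
    std::map<std::string, DispId> ids_{{"Add", kDispIdAdd}, {"Greet", kDispIdGreet}};
    mutable int nameLookups_ = 0;  // counts the runtime name lookups
};

// Late binding: only the function name is known, so a lookup is needed.
inline bool callLateBound(const DispatchSketch& object, const std::string& name,
                          const std::vector<Variant>& args, Variant* result) {
    DispId id = 0;
    if (!object.GetIdOfName(name, &id)) return false;
    return object.Invoke(id, args, result);
}

// DISPID binding: the DISPID is already compiled in, no lookup is needed.
inline bool callDispIdBound(const DispatchSketch& object, DispId id,
                            const std::vector<Variant>& args, Variant* result) {
    return object.Invoke(id, args, result);
}
```

Both helper functions end up in Invoke(); the only thing DISPID binding saves is the name lookup, which the nameLookups() counter makes visible.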


Early Binding

  • A client that uses IUnknown to call functions uses early binding
  • Early binding thus simply uses the vtable call mechanism
  • On page 45, the Wrox Press book "COM IDL & Interface Design" [10] confusingly calls this "very early binding", or "vtable binding"


IDL and vtable/Automation interfaces

IDL uses different syntax for defining the two interface types.


vtable interfaces:

  • vtable interfaces are defined using the "interface" statement. Example:
interface IMyInterface : IUnknown
  • Although I don't recall where, I have seen vtable interfaces called "COM-style interfaces" in order to distinguish them from "Automation-style" interfaces


Automation interfaces

  • Automation interfaces are defined using the "dispinterface" statement. IDispatch is implied, i.e. the name of this interface must not be specified. Example:
dispinterface IMyInterface
  • Automation interfaces are sometimes also called "dispinterfaces"


Dual interfaces

  • A dual interface is defined with the same syntax as a vtable interface, but it derives from IDispatch instead of from IUnknown. Example:
interface IMyInterface : IDispatch
  • There is more syntactic sugar that is not shown here (I believe the attribute "dual" is also involved)


TODO Provide more complete examples. What about .NET?


Type libraries

Overview

The description of one or more COM types (interfaces, classes, etc.) can be stored as a binary representation in a so-called "type library", sometimes abbreviated to "TLB". It can also be said that a type library contains meta data about COM types.

The conversion into a binary representation is made for two reasons:

  • A binary representation can be processed more efficiently by consumers than a source code representation such as an .idl file
  • More importantly, though, the binary type library format is a standard format; type libraries can be generated from a variety of sources, but consumers need only handle this one standard format


Sources

Theoretically there is an unlimited number of possible sources for type libraries. For instance, an exotic programming environment might have its own format for defining COM types, but would then compile this definition into a standard type library. In practice I know of two sources:

  • .idl files. When the MIDL compiler compiles an .idl file, one of its outputs is a .tlb file that contains a type library that represents the definitions in the .idl file.
  • .NET assemblies. The tlbexp.exe utility [11] scans a .NET assembly for types marked (with an attribute) to be used for COM and generates a .tlb file that contains a type library that represents the types in the assembly. Note: The regasm.exe utility [12] also has the capability to generate type libraries.


Deployment

Type libraries can be deployed in one of two forms:

  • As a standalone .tlb file (the standalone file may have a different extension, for instance .olb has been sighted in the context of MS Office)
  • As a resource compiled into a native .dll or .exe


Consumers

A type library is generated for further processing by various consumers. The following is a list of typical type library consumers:

  • Language compilers: For instance, the #import preprocessor directive in Visual C++ (see the #import section for details).
  • The tlbimp.exe utility [13]: Processes a type library to create an Interop assembly that can then be referenced by code written in a .NET language, in order to make COM calls to the types described in the type library. Actually this is just a special known case of the "language compilers" consumer type.
  • COM client at runtime, using late binding: Using information from the type library, the COM library, acting on behalf of the late binding client, can determine which interfaces a COM object supports, and invoke the COM object's interface methods.
  • Visual Studio IntelliSense, at design time: Provides code completion.
  • Visual C++ Class Wizard, at design time: Uses the information to make an ActiveX control available in the dialog editor toolbar.


#import in Visual C++

The #import preprocessor directive [14] in Visual C++ consumes type libraries and generates various pieces of code required by the C++ COM client code to create and interact with COM objects. #import works much like #include. Here are some simple examples:

#import <foo.tlb>   // import a standalone type library
#import <foo.dll>   // import a type library compiled into a .dll as a resource
#import <foo.exe>   // ditto for .exe
#import "foo.tlb"   // double quote syntax works as well, just as with #include


The #import directive looks for the specified file in these locations:

  • Folder that contains the importing file
  • All folders in PATH
  • All folders in LIB
  • All include folders (/I)


#import "de-compiles" the type library and generates two intermediate files, a .tlh and a .tli file. Some details:

  • .tlh = Type library header file, also known as primary header file
  • .tli = Type library inline file, also known as secondary header file
  • The files contain C++ code for all the types in the type library
  • In addition to the raw interface types, the compiler generates smart pointer classes that wrap the interface types. These are recognizable from the "Ptr" suffix, e.g. IFooPtr.
  • These smart pointers are incredibly useful for cleanly managing the lifecycle of COM objects: They automatically increase and decrease the reference count of the COM objects they wrap. See the Wikipedia pages for Smart pointer and RAII for information about the concept.
  • The primary header file (.tlh) includes the secondary header file (.tli)
  • The primary header file is automatically #included into the source code at the location of the #import directive
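
The idea behind the generated smart pointers can be shown with a small portable sketch. This is not the code that #import generates: SmartPtrSketch and CountedObject are hypothetical names, and the stand-in object only counts references instead of destroying itself, so that the counter can be inspected safely.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a COM object; it only tracks the count.
struct CountedObject {
    int refCount = 1;
    void AddRef() { ++refCount; }
    void Release() { assert(refCount > 0); --refCount; }
};

// Minimal RAII wrapper: every live wrapper holds exactly one reference.
template <typename T>
class SmartPtrSketch {
public:
    explicit SmartPtrSketch(T* raw = nullptr) : raw_(raw) {
        if (raw_) raw_->AddRef();
    }
    SmartPtrSketch(const SmartPtrSketch& other) : raw_(other.raw_) {
        if (raw_) raw_->AddRef();
    }
    // Copy-and-swap: the by-value parameter takes the new reference and
    // releases the old one when it goes out of scope.
    SmartPtrSketch& operator=(SmartPtrSketch other) {
        std::swap(raw_, other.raw_);
        return *this;
    }
    ~SmartPtrSketch() {
        if (raw_) raw_->Release();
    }
    T* operator->() const { return raw_; }

private:
    T* raw_;
};
```

Because AddRef() and Release() are tied to the constructor and destructor, a reference cannot be leaked by an early return or an exception, which is the main practical benefit over calling the two functions by hand.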


The #import can be used with many additional attributes. The MSDN reference for #import [14] has all the details.


Registration

If the type library exists as a standalone .tlb file, I only know the undocumented .NET utility regtlibv12.exe to register/unregister the .tlb file:

C:\Windows\Microsoft.NET\Framework\v4.0.30319\regtlibv12.exe    \path\to\foo.tlb
C:\Windows\Microsoft.NET\Framework\v4.0.30319\regtlibv12.exe /u \path\to\foo.tlb

Unfortunately regtlibv12.exe seems to be missing on some systems; this StackOverflow post attempts to provide an explanation.


For .NET assemblies the regasm.exe utility [12] can be used to generate AND register the type library:

regasm /tlb foo.dll


If the type library is embedded as a resource into a native .dll (in-process server) or .exe (out-of-process server), the self-registration process of these files can be used to register the type library. For this to work, the code that performs the self-registration process must be written specifically to register the type library.

  • .dll
    • If the module uses ATL, then ATL::CComModule::RegisterServer() does the job
    • If the module uses MFC (with its OLE support), then AfxOleRegisterTypeLib() performs the registration
  • .exe
    • TODO code example


Code example for .dll self-registration using CComModule:

// Define this global variable
CComModule _Module;


STDAPI DllRegisterServer(void)
{
    // Registers ProgIDs, CoClass CLSIDs and adds other stuff to the registry.
    // This registration only works if you have registrar scripts (.rgs) in your
    // project's resources with entries like this:
    //   1 REGISTRY "YourRegistrarScript.rgs"
    //
    // Also registers the type library and all interfaces in the type library.
    // The type library must be in your project's resources with an entry
    // like this:
    //   1 TYPELIB "YourTypeLibrary.tlb"
    BOOL registerTypeLibrary = TRUE;
    return _Module.RegisterServer(registerTypeLibrary);
}

STDAPI DllUnregisterServer(void)
{
    // Unregisters everything by running registrar scripts (.rgs) in "removal"
    // mode. Also removes the type library and all interfaces in the type
    // library.
    BOOL unregisterTypeLibrary = TRUE;
    return _Module.UnregisterServer(unregisterTypeLibrary);
}


Code example for .dll self-registration using COleObjectFactory and AfxOleRegisterTypeLib():

// Use this snippet of preprocessor magic for every CoClass in the project
IMPLEMENT_OLECREATE(ClassImplementingCoClass, "YourProgID", 0x12345678, 0x90ab, 0xcdef, 0x12, 0x34, 0x5, 0x6, 0x78, 0x90, 0xa, 0xbc)

// This statement generates code from the type library and includes the
// header file which contains GUID variables that you can reference.
#import "YourTypeLibrary.tlb" named_guids


STDAPI DllRegisterServer(void)
{
    // Required for AfxGetInstanceHandle
    AFX_MANAGE_STATE(AfxGetStaticModuleState());

    // Registers ProgIDs and CoClass CLSIDs from your IMPLEMENT_OLECREATE declarations
    BOOL success = COleObjectFactory::UpdateRegistryAll();
    if (! success)
    {
        // Return a failure code here. S_FALSE is a success code, so callers
        // that check the result with FAILED() would not notice the error.
        // SELFREG_E_CLASS is declared in <olectl.h>.
        return SELFREG_E_CLASS;
    }

    // Registers type library and all interfaces in the type library
    // The type library must be in your project's resources with an entry
    // like this:
    //   1 TYPELIB "YourTypeLibrary.tlb"
    // The type library GUID is available due to the #import statement above.
    success = AfxOleRegisterTypeLib(AfxGetInstanceHandle(), YourTypeLibraryNamespace::LIBID_typelibraryname);
    return success ? S_OK : SELFREG_E_TYPELIB;
}

STDAPI DllUnregisterServer(void)
{
    // Required for AfxGetInstanceHandle
    AFX_MANAGE_STATE(AfxGetStaticModuleState());

    // This does NOT work - ProgID and CoClass CLSIDs are not removed from the
    // registry
    BOOL success = COleObjectFactory::UnregisterAll();
    if (! success)
        return SELFREG_E_CLASS;  // failure code; S_FALSE would count as success

    // This does NOT work - The type library itself is removed from the registry,
    // but the interfaces in the type library are NOT removed.
    success = AfxOleUnregisterTypeLib(YourTypeLibraryNamespace::LIBID_typelibraryname);
    return success ? S_OK : SELFREG_E_TYPELIB;
}

TODO: Code example for .dll self-registration using the newer CAtlDllModule (because CComModule is deprecated)


Versioning

Type libraries have a version. A type library version consists of 2 components only:

  • Major version number
  • Minor version number

TODO code examples for an .idl file and a .NET assembly.

When a type library is registered in the Windows Registry, its version number is recorded in the Windows Registry in hexadecimal format. For instance, a type library version "12.0" is recorded as "c.0". I have not found any canonical reference for this behaviour, but there exists a bug report for the .NET 1.0 version of regasm.exe which states that the utility erroneously records the version number in decimal format.
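
As a small illustration of the hexadecimal notation, the following helper formats a version the way it would appear in the registry according to the observation above. The function name typeLibVersionKey() is made up, and the sketch assumes that both the major and the minor component are written in hexadecimal.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a type library version as "<major>.<minor>" with both components
// in lowercase hexadecimal, e.g. 12.0 becomes "c.0".
inline std::string typeLibVersionKey(unsigned major, unsigned minor) {
    char buffer[32];
    const int written = std::snprintf(buffer, sizeof buffer, "%x.%x", major, minor);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    (void)written;  // silences the unused-variable warning in release builds
    return std::string(buffer);
}
```

For versions below 10 the decimal and hexadecimal notations are identical, which may explain why the difference is easy to overlook.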

TODO explain why the type library version is important; at least one case is known: late binding.


MIDL

Overview

The traditional way to define COM interfaces is to specify them in MIDL, the Microsoft Interface Definition Language. MIDL is Microsoft's extension of the OSF DCE IDL, the interface definition language of DCE RPC (it should not be confused with OMG IDL, which is the IDL used by CORBA). The extension adds support for COM and DCOM.

MIDL files are compiled using the MIDL compiler.

The MIDL compiler is available in native projects only. MIDL does not play a role when you design COM interfaces in a .NET language: There you simply write your regular interfaces and classes, then you use attributes to mark an interface/class for use as COM interface/class. COM-specific tools that later process the assembly can pick out the marked interfaces/classes and perform whatever tasks are needed. See the .NET / C# section for details.


The MIDL language

The MIDL syntax is based on the C programming language. There are two files involved [15]:

  • An .idl file and an .acf file. Both contain attributes that direct the generation of the C-language stub files that manage the remote procedure call (RPC)
  • The .idl file contains the software interface, i.e. stuff that is independent of hardware or operating system
  • The .acf file contains hardware/OS specific characteristics of an interface
  • In practice I have never seen an .acf file, so I have not investigated these any further.
  • MSDN has a MIDL language reference [16]


The MIDL compiler

TODO


.idl files

  • The .idl file is the "Interface Definition Language File" [17]
  • The .idl file contains one or more interface definitions
  • An interface definition consists of
    • Interface header: Contains attributes of the interface as a whole, e.g. interface UUID, interface version.
    • Interface body: Contains function prototypes and the data types used for calling those functions. Also contains imports, pragmas, constants and type declarations
  • Example
[
  // Interface header: Interface attributes go here
]
interface INTERFACENAME
{
  // Interface body
}
  • Interfaces can have these base types (TODO: provide more details about these base types):
    • IUnknown
    • IDispatch
    • Dual
  • For details see the MIDL language reference [16]


.tlb files

The MIDL compiler creates a .tlb file which is a binary version of the .idl file. This is called a "Type Library". See the Type libraries section for details.


The COM library

The COM library [18] provides certain essential services at runtime:

  • For COM clients: General functionality to create COM objects
  • For COM servers: General functionality to provide COM objects
  • So-called "implementation-locator services": These are functions that find an implementation based on its CLSID (or ProgID)
  • Transparent RPC calls, should the server be located on a remote system


TODO more information about initializing/uninitializing


The Windows Registry

Location

COM information is stored in a number of locations in the Windows registry:

  • System-wide COM registrations are stored under
HKEY_LOCAL_MACHINE\SOFTWARE\Classes
  • User-specific COM registrations (which override the system-wide settings) are stored under
HKEY_CURRENT_USER\Software\Classes


The two trees listed above are merged under

HKEY_CLASSES_ROOT

Apparently this merged view is primarily intended for backwards compatibility with earlier versions of COM, and old COM utilities that still rely on the presence of this key. It is unclear whether HKCR should be used or not. This is the link to the relevant MSDN article.


TODO what about 64-bit systems?


General tree structure

The tree below any of the above locations looks like this:

root
+-- ProgID1
+-- ProgID2
+-- [...]
+-- CLSID
|    +-- CLSID1
|    +-- CLSID2
|    +-- [...]
+-- Interface
|    +-- IID1
|    +-- IID2
|    +-- [...]
+-- TypeLib
    +-- LIBID1
    +-- LIBID2
    +-- [...]


ProgID keys

Each ProgID key refers to a single COM class. The information below the ProgID key is minimal and basically maps the ProgID to the CLSID of the COM class so that when a COM object is instantiated using a ProgID, the COM library knows under which CLSID registry key it can look up the actual information it needs for instantiation.

A full description of the content of a ProgID key is available from MSDN [5]


Keys below "CLSID"

Each key below the "CLSID" main key is a concrete class ID that identifies a single COM class. The information below the class ID key basically defines where the COM server that provides the COM class is located. A full description of the content of a class ID key is available from MSDN [19], but the most important subkeys are

  • InprocServer32 (used only if the COM server is an in-process server, i.e. the COM server resides in a .dll file)
  • LocalServer32 (used only if the COM server is an out-of-process server, i.e. the COM server resides in an .exe file)


Keys below "Interface"

Each key below the "Interface" main key is a concrete interface ID that identifies a single COM interface.

I don't know the exact purpose of interface ID registry keys, but my current rough understanding is that they are only used for dynamic lookup of information at runtime when late binding is used for COM function calls.

A full description of the content of an interface ID key is available from MSDN [20]


Keys below "TypeLib"

Each key below the "TypeLib" main key is a concrete library ID that identifies a single COM type library.

I don't know the exact purpose of library ID registry keys, but my current rough understanding is that they are used in at least two scenarios:

  • For dynamic lookup of information at runtime when late binding is used for COM function calls
  • At development time when

I have not been able to find an MSDN article that describes the content of a library ID registry key.


Registering components

If one follows the rules that a set of COM registration entries must obey, then it is perfectly possible to manually create all the necessary keys and values in the Windows Registry. Obviously this is a tedious and error-prone process and therefore quite impractical. For this reason, COM registration is usually performed in one of several automated ways. The following short list provides an overview of common methods; the details for each method are provided in subsequent sections.

  • COM servers implemented in native .dll and .exe files can be equipped with the capability of self-registration, i.e. there is a piece of code in the .dll or .exe file that knows which keys/values it needs to place in the Windows Registry to register the COM types that the COM server provides. A well-known interface exists that can be used by an external source to trigger the self-registration process.
  • COM servers implemented in .NET assemblies are registered using the regasm.exe command line utility


A simple approach for quick & dirty registration in a known environment is to store pieces of COM registration information as registry fragments (.reg files). This is not recommended for production use, however, because there might be subtle and unexpected differences between the system where you created the registry fragment snapshot, and the system where you are restoring the snapshot. One not-so-subtle difference might be that registry key locations differ between 32-bit and 64-bit systems.


Self-registration of native COM servers

Since COM servers already possess all the information about the COM types that they provide, it makes sense to equip them with the capability to add this information to the Windows Registry. This process is called "self-registration" [21]. How self-registration is triggered necessarily differs between in-process and out-of-process COM servers, but the actual self-registration process is the same for both server types.


How to trigger self-registration of an in-process COM server (implementation is in a .dll):

  • The .dll must contain two well-known public ("exported") global functions (aka "entry points") that can be called to register/unregister COM types
  • DllRegisterServer() is the function to register COM types
  • DllUnregisterServer() is the function to unregister COM types
  • Anyone can call these functions to initiate the registration/unregistration process, but there is a well-known utility program provided by Microsoft on all Windows systems that does just that: regsvr32.exe.
  • The implementation of the two functions is typically very simple and just calls a function from some helper library that knows how to register/unregister stuff. The following example shows what a native C++ implementation looks like: TODO provide example and explain whether this uses ATL or something else. Possibly provide an OLE example as well.
  • The two global functions are usually made public by way of a Module Definition File (.def file). For instance:
;contains the list of functions that are being exported from this DLL
DESCRIPTION     "Simple COM object"
EXPORTS
    DllRegisterServer      PRIVATE
    DllUnregisterServer    PRIVATE


How to trigger self-registration of an out-of-process COM server (implementation is in an .exe):

  • The .exe must parse its command line and process two well-known command line options for registering/unregistering COM types
  • /RegServer is the command line option to register COM types
  • /UnregServer is the command line option to unregister COM types
  • Anyone can launch the .exe with these command line options to initiate the registration/unregistration process. Unsurprisingly, this is so simple that no special utility program exists to help with this task.
  • As with in-process COM servers, the handling of the command line options is typically very simple and calls some helper library functions to register/unregister stuff. TODO: examples
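The command line handling of an out-of-process server can be sketched like this. It is a portable model under two assumptions that are not stated above: that the options are matched case-insensitively and that a leading "-" is accepted in place of "/", which is how ATL's command line parsing appears to behave. The names RegistrationRequest and parseRegistrationOption are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class RegistrationRequest { None, Register, Unregister };

// Case-insensitive comparison of two ASCII strings.
static bool equalsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Classifies a single command line argument.
RegistrationRequest parseRegistrationOption(const std::string& arg)
{
    // Assumption: both "/Option" and "-Option" are accepted.
    if (arg.size() < 2 || (arg[0] != '/' && arg[0] != '-'))
        return RegistrationRequest::None;

    const std::string name = arg.substr(1);
    if (equalsIgnoreCase(name, "RegServer"))
        return RegistrationRequest::Register;
    if (equalsIgnoreCase(name, "UnregServer"))
        return RegistrationRequest::Unregister;
    return RegistrationRequest::None;
}
```

A server's entry point would run each argument through such a function, perform the registration or unregistration, and then exit without starting to serve COM objects.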


For both types of COM servers, the actual self-registration process works like this:

  • The .vcxproj code project contains a so-called "ATL registrar script", which is an .rgs file that describes the required COM registration entries
  • The .vcxproj code project also contains an .rc file which references the .rgs file
  • When the project is compiled the .rgs file is compiled as a resource into the .dll or .exe
  • Optionally for type libraries
    • The code project contains a MIDL compile phase that generates a type library (.tlb file) from an .idl file
    • The .rc file references the .tlb file
    • When the project is compiled the .tlb file is compiled as a resource into the .dll or .exe
  • DllRegisterServer() or the handler for the /RegServer command line option calls CComModule::UpdateRegistryFromResource()
    • TODO: Is this true?
  • DllRegisterServer() or the handler for the /RegServer command line option calls CComModule::RegisterServer()
  • This function loads the .rgs resource and modifies the Windows Registry with the content of the resource
  • In addition this function can also be told to register the .tlb resource in the Windows Registry


.rgs files are so-called "registrar scripts", because they are scripts processed by the ATL registrar service which knows how to populate the Windows Registry with keys/values according to the directives in the registrar script. See MSDN [1] for further reference on how .rgs files work. Stack Overflow also has some useful info, for instance this question.


.NET assemblies

The regasm.exe command line utility is used to register COM servers that are implemented in a .NET assembly:

regasm /codebase \path\to\foo.dll

The option /codebase is not necessary if the assembly is located in the GAC. However, in the example above we assume that the assembly is not in the GAC, and so /codebase must be specified. When used, the option tells regasm.exe to write the full filesystem path into the Windows registry so that the system can find the assembly when it is needed by a client. Note that regasm.exe will complain if the assembly does not have a strong name, but luckily this warning can be safely ignored.

An assembly can be unregistered as well:

regasm /unregister \path\to\foo.dll


Automatic registration as part of the build

When developing a COM server it may be tedious to re-register the server after every change. One solution to this problem is to add a post-build step to the project that runs regsvr32.exe or regasm.exe on the build result. VC++ projects and .NET projects both offer a built-in alternative to a handcrafted post-build step:

  • .vcxproj: Configuration Properties > Linker > General > Register Output = Yes
  • .csproj: Build > Register for COM interop = True


I strongly advise against using these convenience auto-registration options because if COM registration is not carefully controlled, over time loads of stale entries will accumulate in the Windows registry of the build machine. This is especially true if there is a build server that assigns a new version number for each build: In such a scenario a new set of registry entries will be created for each and every build! Moreover, auto-registration is usually not necessary at all on a build server because the COM component will never run on such a system.


Types of COM servers

In-process and Out-of-process

There are two types of COM servers which are distinguished by how a COM client interacts with a COM server:

  • In-process COM server: The COM server implementation resides in a .dll file. When the COM client creates one of the server's COM objects, the COM server DLL is loaded into the address space of the COM client process.
  • Out-of-process COM server: The COM server implementation resides in an .exe file. When the COM client creates one of the server's COM objects, the COM server executable is launched as a separate process.

Out-of-process COM servers can be located either on the local machine or on a remote machine, while in-process COM servers can be located only on the local machine.


Differences in the Windows Registry

  • If a COM class is provided by an in-process COM server, its CLSID entry in the Windows Registry contains a sub-key named "InprocServer32" (or "InprocServer" for 16-bit COM servers)
  • If a COM class is provided by an out-of-process COM server, its CLSID entry in the Windows Registry contains a sub-key named "LocalServer32" (or "LocalServer" for 16-bit COM servers)


DLL surrogates

Out-of-process COM servers have some benefits over in-process COM servers, such as:

  • A COM client is better isolated from errors that occur in the COM server. If an unhandled exception occurs in an in-process COM server it will crash the COM client process, whereas if the same happens in an out-of-process COM server the only thing that happens is that the current function call terminates with an error (of course, the COM client may still crash if it has insufficient error handling, but that's another issue).
  • An out-of-process COM server that runs in its own process can have different security attributes than the COM client process.

Nevertheless, it might still be desirable for some reason to implement the COM server as an in-process server. The solution to get the best of both worlds is to create a so-called "DLL surrogate": This is an out-of-process server (.exe) that is a simple wrapper around the in-process server (the .dll). Apparently COM provides a default surrogate process which is supposed to make creation of a DLL surrogate easy, but it is also possible to write a custom DLL surrogate. This MSDN article is the starting point for information about DLL surrogates: [22].


Native C++ implementations

MFC

TODO

Only a part of MFC has to do with COM, while most of ATL is concerned with COM

articles


ATL

TODO

Fundamentals of ATL COM Objects: [18]


COM server

TODO

A COM server can serve many COM objects to its clients. This section explains how to prepare a .dll or an .exe project to function as an in-process or out-of-process COM server. The next section explains what must be done so that the basically functioning COM server is actually serving COM objects.

Distinguish between a modern and an obsolete implementation. Obsolete implementations use CComModule, modern implementations use ATL module classes (http://msdn.microsoft.com/en-us/library/79kd7a00.aspx).

Add details about what kinds of marshaling support there are: http://msdn.microsoft.com/en-us/library/ms686605.aspx

  • CoGetClassObject(), DllGetClassObject() and DllCanUnloadNow()
  • http://stackoverflow.com/questions/2919525/ccommoduleunlock
  • CoGetClassObject() is called to get the class object of the specified CLSID
  • The function automatically loads the .dll or .exe which contains the class object according to the Windows Registry
  • If it's a .dll
    • The .dll must implement and export the global function DllGetClassObject()
    • The DllGetClassObject() implementation must return a matching class object. See the MSDN page for DllGetClassObject() for an example implementation.
    • Using the class object it is now possible to create concrete instances of the COM class
    • It is also possible to call CoCreateInstance() or CoCreateInstanceEx() as a shortcut that obtains the class object and creates the instance in one step
    • The .dll must also implement and export the global function DllCanUnloadNow()
      • The purpose of DllCanUnloadNow() is to let the caller know whether there are still COM objects alive that originate from the COM server in this .dll
      • DllCanUnloadNow() must return S_OK if the object count for the .dll is 0, and return S_FALSE if the object count is > 0
      • The object count is usually stored in a CComModule instance, and it can be queried like this (TODO: CComModule is deprecated, what's the modern way to do this?)
STDAPI DllCanUnloadNow(void)
{
  AFX_MANAGE_STATE(AfxGetStaticModuleState());
  return (_Module.GetLockCount()==0) ? S_OK : S_FALSE;
}
      • When new COM objects are created, they call CComModule::Lock() behind the scenes
      • When COM objects are destroyed, they call CComModule::Unlock() behind the scenes
      • DllCanUnloadNow() is periodically called by the system (TODO where did I read this?)
      • DllCanUnloadNow() is also called when CoFreeUnusedLibraries() is called
        • CoFreeUnusedLibraries() exists only for the sake of compatibility
        • Originally it was recommended that the function be periodically called to free resources
      • http://msdn.microsoft.com/en-us/library/ms679712.aspx
    • If a .dll does not implement DllCanUnloadNow() it is unloaded only when CoUninitialize() is called
  • If it's an .exe
    • TODO
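The interplay between object lifetime, the lock count and DllCanUnloadNow() can be modeled without any Windows headers. The following is a sketch only: ModuleLockCount and FooObject are hypothetical stand-ins for the ATL module object and for a COM object, and the two constants repeat the values of S_OK and S_FALSE from the Windows SDK so that the code compiles anywhere. In newer ATL versions the module classes derived from CAtlDllModuleT offer a DllCanUnloadNow() member that wraps the same check.

```cpp
#include <atomic>

// Values of S_OK and S_FALSE, repeated here for portability.
const long kS_OK = 0;
const long kS_FALSE = 1;

// Hypothetical stand-in for the lock count kept by the module object.
class ModuleLockCount {
public:
    long lock()   { return ++count_; }  // an object was created
    long unlock() { return --count_; }  // an object was destroyed
    long get() const { return count_.load(); }
private:
    std::atomic<long> count_{0};
};

// Model of the decision that DllCanUnloadNow() has to make.
long canUnloadNow(const ModuleLockCount& module)
{
    return module.get() == 0 ? kS_OK : kS_FALSE;
}

// Hypothetical COM object: keeps the module locked while it exists.
class FooObject {
public:
    explicit FooObject(ModuleLockCount& module) : module_(module) { module_.lock(); }
    ~FooObject() { module_.unlock(); }
    FooObject(const FooObject&) = delete;
    FooObject& operator=(const FooObject&) = delete;
private:
    ModuleLockCount& module_;
};
```

As long as at least one FooObject exists, canUnloadNow() answers kS_FALSE; once the last one is gone it answers kS_OK again.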


COM object

TODO


COM client

TODO

Mention smart pointers here

Determine the CLSID of a COM class from a ProgID:

CLSID clsID;
HRESULT hRes = CLSIDFromProgID(L"foo", &clsID);  // L prefix -> unicode/wide characters

Create an instance of a COM class, typed to a specific interface (TODO: what's this about CLSCTX_ALL?):

CLSID clsID;
[...]  // initialize CLSID

// Example 1: interface ID obtained with the __uuidof operator
DWORD clsContext = CLSCTX_ALL;  // type of COM server that should manage the object
IFoo* pFoo1 = NULL;
HRESULT result1 = CoCreateInstance(clsID, NULL, clsContext, __uuidof(IFoo), (void**)&pFoo1);

// Example 2: interface ID obtained from the IID_IFoo constant
IFoo* pFoo2 = NULL;
HRESULT result2 = CoCreateInstance(clsID, NULL, clsContext, IID_IFoo, (void**)&pFoo2);

Cast a COM object to an interface:

// Somehow create object and store instance in pObj
[...]

// Example 1
IFoo* pFoo1 = NULL;
HRESULT result = pObj->QueryInterface(__uuidof(IFoo), (void**)&pFoo1);
if (FAILED(result)) { [...] }  // e.g. E_NOINTERFACE if the object does not implement IFoo

// Example 2
IFoo* pFoo2 = NULL;
pObj->QueryInterface(IID_IFoo, (void**)&pFoo2);
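Regarding the smart pointers mentioned in the TODO at the top of this section: every interface pointer obtained from CoCreateInstance() or QueryInterface() holds a reference that must be released again, and smart pointers such as ATL's CComPtr automate this. The following portable sketch shows the idea only. RefCounted, ScopedComPtr and CountingObject are hypothetical names, and a real COM object would destroy itself when its count drops to zero.

```cpp
#include <utility>

// Hypothetical stand-in for the reference counting part of IUnknown.
struct RefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~RefCounted() {}
};

// Minimal model of a COM smart pointer: one AddRef() per copy,
// one Release() when a copy goes out of scope.
template <typename T>
class ScopedComPtr {
public:
    ScopedComPtr() : ptr_(nullptr) {}
    explicit ScopedComPtr(T* p) : ptr_(p) { if (ptr_) ptr_->AddRef(); }
    ScopedComPtr(const ScopedComPtr& other) : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    // Copy-and-swap: the by-value parameter already holds its own reference.
    ScopedComPtr& operator=(ScopedComPtr other) { std::swap(ptr_, other.ptr_); return *this; }
    ~ScopedComPtr() { if (ptr_) ptr_->Release(); }

    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
};

// Hypothetical object that only counts; it does not delete itself at zero.
class CountingObject : public RefCounted {
public:
    CountingObject() : refs_(0) {}
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override { return --refs_; }
    unsigned long refs() const { return refs_; }
private:
    unsigned long refs_;
};
```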


.NET / C#

Implement COM server

TODO

The assembly has a GUID which is used for the .tlb (TODO is this true?)

COM types need to be marked and made visible to become available via COM

  • either all COM types
[assembly: ComVisible(true)]
  • or individual types (TODO: how?)

The type library (.tlb file) is automatically generated by the compiler as soon as at least one COM type is made visible (TODO is this true?). The .tlb file is placed in the same folder as the assembly.

Other IDL attributes can be added with this syntax:

[ProgId("Foobar"), ClassInterface(ClassInterfaceType.None)]

A problem is that the type library generated by the C# compiler cannot be used to create an Interop assembly. This makes it difficult to have a .NET COM client if the COM server is also implemented in .NET. There is an interesting article on Code Project that goes into great depth about how .NET COM servers can be built [23]. The article also explains how to circumvent the type library problem. TODO: How? If I remember correctly, we must specify the interfaces in .idl and create the type library from there.


Interop .NET <-> COM

References:

Overview

  • .NET and COM have different approaches regarding memory and lifetime management
  • Data types are also different
  • .NET contains support to bridge these differences between the two worlds, in BOTH directions!
  • Calls from .NET to COM use the so-called "Runtime Callable Wrapper" (RCW)
  • Calls from COM to .NET use the so-called "COM-Callable Wrapper" (CCW)

Runtime Callable Wrapper (RCW)

  • The RCW translates calls from .NET into the COM world
  • The RCW is created at runtime by the CLR (Common Language Runtime) using the information from an interop assembly (more on these follows further down)
    • The executable code for the RCW is NOT stored in the interop assembly - the interop assembly merely provides metadata for creating the necessary objects that make up the RCW
  • It's the job of the RCW ...
    • ... to correctly manage reference counting of COM objects
    • ... to marshal parameter and return values (e.g. translate HRESULT return values into .NET exceptions)
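The HRESULT translation mentioned in the last bullet can be illustrated with a small portable model. It is not the RCW's code: HResult, ComError and throwIfFailed are hypothetical names, and the three constants repeat the Windows SDK values of S_OK, S_FALSE and E_FAIL. The sketch shows that only failure codes (top bit set) become exceptions, while success codes other than S_OK pass through silently.

```cpp
#include <cstdint>
#include <stdexcept>

using HResult = std::int32_t;  // same width as the Windows HRESULT

const HResult kS_OK    = 0;
const HResult kS_FALSE = 1;
const HResult kE_FAIL  = static_cast<HResult>(0x80004005u);

// Mirrors the FAILED() macro: a set top bit (negative value) means failure.
inline bool failed(HResult hr) { return hr < 0; }

// Hypothetical exception type that carries the original HRESULT.
class ComError : public std::runtime_error {
public:
    explicit ComError(HResult hr)
        : std::runtime_error("COM call failed"), code_(hr) {}
    HResult code() const { return code_; }
private:
    HResult code_;
};

// Turns a failure HRESULT into an exception, lets success codes pass.
inline void throwIfFailed(HResult hr)
{
    if (failed(hr))
        throw ComError(hr);
}
```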

COM-Callable Wrapper (CCW)

  • The CCW translates calls from the COM world into .NET
  • The CCW is created at runtime by the CLR (Common Language Runtime) when the first COM call is made into a .NET assembly
    • Assumption from the above information about the RCW: The executable code for the CCW is NOT stored in the assembly that provides the COM server - the assembly merely provides metadata for creating the necessary objects that make up the CCW
  • It's the job of the CCW ...
    • ... to translate reference counting expected by COM objects into the garbage-collected environment of the CLR
    • ... to marshal parameter and return values


Interop assemblies

Some basics:

  • An interop assembly is...
    • An interop assembly is a .NET assembly
    • An interop assembly contains metadata that corresponds to the types of the original type library from which the interop assembly was created
    • .NET projects require interop assembly metadata because they cannot directly handle COM type libraries or COM .dll/.exe files
    • Interop assemblies can be seen as a different format for COM type descriptions, a format that is understood by .NET
  • An interop assembly is not...
    • An interop assembly does NOT contain executable code (IL code, Intermediate Language code)
    • An interop assembly does not embed the type library from which it was created (as one might assume)
    • An interop assembly is not a wrapper for the actual COM server - when you create a COM object in .NET code, you still get it directly from the COM server
    • An interop assembly does not perform marshaling - this is the task of the RCW (Runtime Callable Wrapper), which "lives" in the calling .NET assembly


Design time

  • A .NET project that consumes COM types requires an assembly reference to an interop assembly that contains the metadata about the COM types
  • The content of an interop assembly can be inspected with ildasm.exe (the MSIL disassembler)
  • The .NET project that consumes COM types does so via early binding


Runtime

  • The CLR requires an interop assembly at runtime to create the RCW (Runtime Callable Wrapper), which performs marshaling from .NET to COM
  • Instead of physically distributing an interop assembly file along with the actual consuming .NET assembly, it is possible to embed the necessary metadata into the consuming .NET assembly at compile time (TODO verify this and provide more details)
  • Rough sketch of the sequence of events
    • .NET code calls COM function
    • The CLR loads metadata, either from an interop assembly that is physically present, or from the calling assembly itself which embeds the necessary metadata
    • Using the metadata, the CLR generates the marshaling code (= the RCW) that matches the COM call on-the-fly
    • The marshaling code is executed and the COM function is called


How to generate an interop assembly

  • Explicit generation (tlbimp.exe)
    • The input for an interop assembly is the COM type information in a type library (which was earlier generated from an .idl file)
    • The command line utility tlbimp.exe is used to perform the generation, for instance
tlbimp \path\to\FooSDK.tlb /namespace:Interop.FooSDK /out:\path\to\Interop.FooSDK.dll
    • The interop assembly can now be added to a consuming .NET project as an assembly reference
  • Implicit generation (reference to COM)
    • If a COM component is already registered on the development system, a reference to that COM component can be added to a .NET project
    • In the background, Visual Studio now automatically creates an interop assembly
    • The interop assembly is placed into the "bin" folder of the project
    • Presumably (TODO verify) the COM component must also register a type library for the interop assembly to be created successfully


TODO mention that an interop type library (see next section) generated from a .NET assembly cannot be used to create an interop assembly! I still don't know why this is the case; it makes COM calls from one .NET assembly into another .NET assembly very difficult!


Interop type libraries

Basics

  • An interop type library contains a description of the COM types that a .NET assembly provides
  • An interop type library is used in non-managed programming languages to get type information about the COM types provided by the .NET assembly (TODO: does this constitute early binding?)


How to generate an interop type library

  • The command line utility tlbexp.exe is used to perform the generation, for instance
tlbexp dotNetAssembly.dll
  • regasm.exe can also be used to create an interop type library; in this case the type library is also registered in the Windows Registry. Example:
regasm dotNetAssembly.dll /tlb


Late binding

This code snippet shows how a C# client instantiates a COM object and invokes one of its functions via late binding.

public void doComWithLateBinding()
{
    string progID = "Foo";
    Type progIDType = Type.GetTypeFromProgID(progID);
    object comObject = Activator.CreateInstance(progIDType);
    Type comObjectType = comObject.GetType();

    try
    {
        object[] inputParams = {
            123,
            "bar",
        };
        var result = comObjectType.InvokeMember(
            "doIt",
            System.Reflection.BindingFlags.InvokeMethod,
            null,
            comObject,
            inputParams);
        MessageBox.Show(string.Format("result = {0}", result));
    }
    catch (System.Exception exception)
    {
        MessageBox.Show(string.Format("exception = {0}", exception.Message));
    }

    if (System.Runtime.InteropServices.Marshal.IsComObject(comObject))
    {
        System.Runtime.InteropServices.Marshal.ReleaseComObject(comObject);
    }
}


Event Sinks

References

  • Code Project: An introduction to event sinks in C++ in the context of ATL COM Add Ins [24]
  • MSDN: Events in COM and Connectable Objects [25]
  • MSDN: Event Sink Maps [26]
  • MSDN: Supporting IDispEventImpl [27]
  • MSDN: COM Events in .NET [28]


Basics

The concept is a relatively simple one:

  • A COM class may declare that it generates events. This COM class is the event source.
  • A different COM class may declare that it would like to receive notifications when an event occurs. This COM class is the event sink.
  • In other words, these are just new terms for the Observer design pattern, or an event listener model
  • The event source declares the interface, the event sink implements the interface


Implementing an event sink

In order to implement an event sink in C++

  • The event sink class must be derived 1-n times from IDispEventImpl
    • The event sink class must have one inheritance specification per event interface that it wants to support
    • This is possible because IDispEventImpl is a template class
    • The event sink class must give a unique ID to each inheritance specification
    • Example
#import <TypeLibraryID1.tlb>
#import <TypeLibraryID2.tlb>

class FooBarEventsSink : [...],
                         public IDispEventImpl <1, FooBarEventsSink, &__uuidof(FooEvents), &__uuidof(TypeLibraryID1), 1, 0>,
                         public IDispEventImpl <2, FooBarEventsSink, &__uuidof(BarEvents), &__uuidof(TypeLibraryID2), 1, 0>
[...]
    • Obviously, the type libraries that contain the event interface specification must be imported before the event sink declaration
  • The event sink class must define the handlers that should be invoked when one of the events occurs
    • The event sink class defines a so-called "event sink map" to map each event to its handler
    • As usual with ATL, this is done with preprocessor macros
    • Example
BEGIN_SINK_MAP(FooBarEventsSink)
    SINK_ENTRY_EX (1, __uuidof(FooEvents), 1, OnFooEvent)  // first 1 = ID given to the matching IDispEventImpl base class, second 1 = Event ID (DispID in the event interface)
END_SINK_MAP()
    • Handler methods must have the _stdcall calling convention
    • Handler methods must have the same arguments as the event method
  • Somewhere there must be code that registers an event sink object as an "observer"
    • Registering/unregistering is called advising/unadvising
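Advising and unadvising can be modeled in portable C++ as a plain observer registry with cookies. This sketch is not ATL code: FooEventSource and RecordingSink are hypothetical, and FooEvents is reduced to an ordinary abstract class. In real COM the same roles are played by IConnectionPoint::Advise() and IConnectionPoint::Unadvise(), which also identify a connection by a cookie; ATL sinks typically reach them through DispEventAdvise() and DispEventUnadvise().

```cpp
#include <map>

// Hypothetical event interface, declared by the event source.
struct FooEvents {
    virtual void OnFooEvent(int value) = 0;
    virtual ~FooEvents() {}
};

// Model of an event source that hands out one cookie per connection.
class FooEventSource {
public:
    FooEventSource() : nextCookie_(1) {}

    // Registers a sink; the cookie is needed later to unadvise.
    unsigned long advise(FooEvents* sink)
    {
        const unsigned long cookie = nextCookie_++;
        sinks_[cookie] = sink;
        return cookie;
    }

    // Removes the connection; returns false for an unknown cookie.
    bool unadvise(unsigned long cookie)
    {
        return sinks_.erase(cookie) == 1;
    }

    // Notifies every sink that is currently advised.
    void fire(int value)
    {
        for (std::map<unsigned long, FooEvents*>::iterator it = sinks_.begin();
             it != sinks_.end(); ++it)
            it->second->OnFooEvent(value);
    }

private:
    unsigned long nextCookie_;
    std::map<unsigned long, FooEvents*> sinks_;
};

// Hypothetical event sink: implements the interface and records calls.
class RecordingSink : public FooEvents {
public:
    RecordingSink() : lastValue(0), callCount(0) {}
    void OnFooEvent(int value) override { lastValue = value; ++callCount; }
    int lastValue;
    int callCount;
};
```

A sink that has been unadvised no longer receives events, which is why the cookie returned by advise() has to be kept for as long as the connection exists.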


TODO: Finish this section. An example for advising/unadvising code is required, as well as an actual event handler example.


Implementing an event source

TODO


Apartments / Threading

References

  • MSDN Processes, Threads, and Apartments: [29]
  • What are these "Threading Models" and why do I care? [30]
  • Wikipedia: Threading in COM: [31]
  • MSDN Single-Threaded Apartments: [32]
  • MSDN In-Process Server Threading Issues: [33]
  • Registry key declaration for "InprocServer32": [34]


Basics

  • All COM objects in a process are grouped into so-called "apartments" [29]
  • A COM object "lives" in exactly one apartment [29]. See the next section for a definition of "lives".
  • Once created a COM object remains in the same apartment for its entire lifecycle [31]
  • The purpose of the apartment model is to ensure that COM clients and COM servers/objects can safely communicate with each other, even if they have different threading models. See further down for a definition of "threading model".
  • There are 2 types of apartments: Single-Threaded Apartments (STA) and Multithreaded Apartments (MTA)
  • The first apartment created in a process is called the "Main Apartment" [29]


A process and its apartments

A process has the following apartments [29] :

  • 0 or 1 MTA
  • 0-n STAs
  • The process always has at least 1 apartment
  • The first thread in a process that calls CoInitializeEx() creates the Main Apartment [29]


Definition of "to live in an apartment"

A COM object "lives" in exactly one apartment.


For COM objects that "live" in an STA this means: [29]

  • The functions of the COM object can only be called in the context of the thread that belongs to the apartment where the COM object "lives"
  • If a thread outside of the COM object's apartment wants to call a function of the COM object, the function call must go through a proxy. COM automatically provides a proxy for every apartment.
  • The proxy's job is to "translate" apartment-external function calls into apartment-internal function calls
  • "Translate" here means, synchronize between the calling thread (outside of the apartment) and the called thread (inside the apartment)
  • How the proxy achieves synchronization is of secondary interest (e.g. it could do this via a mutex or some similar mechanism), the main thing to know is that the proxy's synchronization effectively causes the function calls to be made sequential, one after the other
  • The purpose of all this is to protect COM objects that were NOT programmed thread-safe from calls originating from different threads in multi-threaded COM clients


All this does NOT apply to COM objects that live in the MTA of a process: No synchronization is necessary for these COM objects because it is assumed that they have been programmed thread-safe [29]. A call made to a function of a COM object that lives in the MTA of a process is "passed through", i.e. no synchronization occurs at all.


Single-Threaded Apartments (STA)

An STA consists of exactly 1 (one) thread. [29] It is unclear which threading model the thread of an STA uses (TODO).

The MSDN article "Single-Threaded Apartments" [32] lists the following "Rules for single-threaded apartments":

  • Every object should live on only one thread (within a single-threaded apartment).
  • Initialize the COM library for each thread.
  • Marshal all pointers to objects when passing them between apartments.
  • Each single-threaded apartment must have a message loop to handle calls from other processes and apartments within the same process. Single-threaded apartments without objects (client only) also need a message loop to dispatch the broadcast messages that some applications use.
  • DLL-based or in-process objects do not call the COM initialization functions; instead, they register their threading model with the ThreadingModel named-value under the InprocServer32 key in the registry. Apartment-aware objects must also write DLL entry points carefully. There are special considerations that apply to threading in-process servers. For more information, see "In-Process Server Threading Issues" [33].


TODO: Dissect and understand these rules.


Multithreaded Apartments (MTA)

An MTA consists of 1-n threads. [29] All threads in an MTA use the "free-threading" threading model. [29]


Threading Models

A COM class that resides in an in-process COM server specifies the "Threading Model" that it supports in the Windows Registry. For instance:

HKEY_CLASSES_ROOT\CLSID\{4EABE692-98A1-4301-AC5A-C1781D0297A1}\InprocServer32\ThreadingModel = Both

The following values exist for the "ThreadingModel" attribute:

  • Apartment: The COM object can run in an STA only. Typically this means that the COM object was NOT programmed thread-safe.
  • Free: The COM object can run in an MTA only. This means that the COM object WAS programmed thread-safe. Why one would restrict the object to the MTA only is not clear.
  • Both: The COM object can run in an STA as well as in an MTA. The thread that creates the object determines the apartment that the object runs in. Because the COM object can run in an MTA, this means that the COM object WAS programmed thread-safe.
  • Neutral: The COM object totally ignores the threading model of its caller (TODO: What does this mean?)

Whenever possible, the COM object runs in the same apartment as the client that instantiates it. See the MSDN documentation for "InprocServer32" [34] for details what happens if this is not possible. TODO: The MSDN article states that if the "ThreadingModel" attribute is not present or is not set to a value, the COM object runs in the first apartment that was initialized in the process. This seems to be contradictory to some of the tables also present in the same article.
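The placement rules for in-process servers can be condensed into a small decision function. The sketch below follows the table in the "InprocServer32" documentation [34] as it is commonly read; the enum and function names are hypothetical, "None" stands for a missing ThreadingModel value, and "main STA" means the first STA that was initialized in the process. Treat it as a summary to check against [34], not as a specification.

```cpp
// Apartment of the thread that creates the object.
enum class ClientApartment { MainSta, OtherSta, Mta };

// Value of the ThreadingModel registry entry (None = not present).
enum class ThreadingModel { None, Apartment, Free, Both, Neutral };

// Apartment in which the new object ends up.
enum class ObjectApartment { SameAsClient, MainSta, HostSta, Mta, Neutral };

ObjectApartment placeObject(ClientApartment client, ThreadingModel model)
{
    switch (model) {
    case ThreadingModel::None:
        // Legacy objects always live in the main STA.
        return client == ClientApartment::MainSta ? ObjectApartment::SameAsClient
                                                  : ObjectApartment::MainSta;
    case ThreadingModel::Apartment:
        // Any STA will do; an MTA client gets a host STA created by COM.
        return client == ClientApartment::Mta ? ObjectApartment::HostSta
                                              : ObjectApartment::SameAsClient;
    case ThreadingModel::Free:
        // Always the MTA, regardless of who creates the object.
        return client == ClientApartment::Mta ? ObjectApartment::SameAsClient
                                              : ObjectApartment::Mta;
    case ThreadingModel::Both:
        // The creating thread decides.
        return ObjectApartment::SameAsClient;
    case ThreadingModel::Neutral:
        // Always the neutral apartment.
        return ObjectApartment::Neutral;
    }
    return ObjectApartment::SameAsClient;  // not reached
}

// True if calls from the creating thread have to cross an apartment border.
bool crossesApartments(ClientApartment client, ThreadingModel model)
{
    return placeObject(client, model) != ObjectApartment::SameAsClient;
}
```

Read this way, "Both" is the only value that never forces the creating thread across an apartment border.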


TODO: The Microsoft support document "INFO: Descriptions and Workings of OLE Threading Models" [35] has extensive information on threading models and apartments. At the end of the document there is an extremely useful table that shows in a single glance how calls are made for in-process COM servers when different apartments are involved. Integrate this information into this wiki page.


Note: The main MSDN article [29] mentions "free threads" and threads that use a "free-threading model". This is misleading since it is the COM object and not the thread that has an attribute "threading model".


Creating and destroying apartments

The information in this section is from [30].

  • An apartment is created by a call to CoInitialize() or CoInitializeEx()
    • CoInitialize() always creates an STA
    • CoInitializeEx() with the parameter COINIT_APARTMENTTHREADED also creates an STA
    • CoInitializeEx() with the parameter COINIT_MULTITHREADED creates an MTA
  • The thread that calls CoInitialize() or CoInitializeEx() assigns itself to the apartment it creates
  • An apartment is destroyed by a call to CoUninitialize()
  • When an apartment is destroyed, all the COM objects that still live in the apartment at that time are destroyed as well


TODO: By definition a process can have at most one MTA. What happens if a second thread calls CoInitializeEx() with the parameter COINIT_MULTITHREADED? If this is not allowed, then how does a thread assign itself to the MTA if some other thread has already created the MTA?

TODO: What happens if a thread calls CoInitializeEx() several times in a row? What happens if a thread calls CoInitializeEx(), CoUninitialize(), and then CoInitializeEx() again?


Threads and apartments

A thread remains in the same apartment throughout its entire life. [31]


TODO: This statement seems to conflict with the information in the previous section.


Function calls that cross apartment borders

TODO: Provide more details, especially about marshalling and about the Windows message queue in STAs.
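
As a starting point for the message-queue part of this TODO, here is a toy model in portable C++. It is not the COM API and involves no marshalling. It only illustrates the idea, as described in [32], that calls arriving at an STA from outside are queued and then executed one at a time when the STA's thread dispatches its messages. The names ToySta, post(), pump() and runToyStaDemo() are hypothetical. Real STAs use window messages and can be re-entered while a call is in progress, which this sketch ignores.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a single-threaded apartment's call queue (not COM).
class ToySta {
public:
    // Models a call arriving from another apartment: it is queued, not run.
    void post(std::function<void()> call) { pending_.push(std::move(call)); }

    // Models the STA thread dispatching its messages: every queued call is
    // executed in arrival order. Returns the number of calls dispatched.
    std::size_t pump() {
        std::size_t dispatched = 0;
        while (!pending_.empty()) {
            std::function<void()> call = std::move(pending_.front());
            pending_.pop();
            call();
            ++dispatched;
        }
        return dispatched;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};

// Usage example: two "incoming calls" only run once the queue is pumped.
std::vector<std::string> runToyStaDemo() {
    ToySta sta;
    std::vector<std::string> log;
    sta.post([&log] { log.push_back("first call"); });
    sta.post([&log] { log.push_back("second call"); });
    log.push_back("nothing executed yet");  // queued calls have not run
    sta.pump();                             // now both run, in order
    return log;
}
```

The point of the model is that an STA whose thread never pumps leaves incoming calls stuck in the queue, which is why the message queue matters for cross-apartment calls.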


Stuff that is unclear

  • Which thread model is used by the thread in an STA?
  • After reading [30] it seems that no apartment exists if no thread ever calls CoInitialize() or CoInitializeEx(). Is this correct?
  • The Wikipedia article [31] mentions "Neutral Apartments". I did not understand what that is.
  • The main MSDN article [29] mentions the "Main Apartment". Is the Main Apartment something special? Does it have an attribute and/or behaviour that makes it different from any of the other apartments?
  • How can I find out...
    • which apartments a process has
    • which apartment a thread belongs to
    • which apartment type a thread belongs to
    • which apartment a COM object belongs to
    • which apartment type a COM object belongs to
    • which thread a COM object belongs to
  • I don't fully understand the STA rules from the MSDN article about STAs [32]
  • The MSDN article about STAs [32] mentions that a process can be initialised as an "apartment model process". This implies that a process has an attribute which somehow defines whether or not a process uses apartments. It is unclear if this is truly so, or if the docs are badly written.
  • The Wikipedia article [31] says that a thread cannot change its apartment during its lifetime. But what happens if the thread first calls CoInitialize(), then CoUninitialize(), and finally CoInitializeEx() with the parameter COINIT_MULTITHREADED?
  • What is the "neutral" threading model that COM classes can specify in the Windows Registry?
  • The MSDN article "In-Process Server Threading Issues" [33] is a complete disaster. When I last read the article there were so many unclear / ambiguous wordings and even contradictory information that at some point I stopped trying to understand the stuff.
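
For the question of which apartment type a thread belongs to, one possible way to investigate is the COM function CoGetApartmentType(), which is available on Windows 7 and later. The sketch below assumes the Windows SDK and ole32.lib. The helpers aptTypeName() and printCurrentApartmentType() are hypothetical names invented for this page. The numeric values in aptTypeName() mirror the APTTYPE enumeration from the SDK headers. The other questions in the list are not addressed by this sketch.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <objbase.h>  // CoGetApartmentType(), APTTYPE, APTTYPEQUALIFIER
#endif

// Hypothetical helper (not part of COM): name the values of the APTTYPE
// enumeration. Taking an int keeps the helper compilable outside Windows.
std::string aptTypeName(int type) {
    switch (type) {
        case 0:  return "STA";       // APTTYPE_STA
        case 1:  return "MTA";       // APTTYPE_MTA
        case 2:  return "NA";        // APTTYPE_NA
        case 3:  return "main STA";  // APTTYPE_MAINSTA
        default: return "unknown";
    }
}

#ifdef _WIN32
// Print the apartment type that COM reports for the calling thread.
void printCurrentApartmentType() {
    APTTYPE type;
    APTTYPEQUALIFIER qualifier;
    HRESULT hr = CoGetApartmentType(&type, &qualifier);
    if (SUCCEEDED(hr)) {
        std::printf("apartment type: %s (qualifier %d)\n",
                    aptTypeName(static_cast<int>(type)).c_str(),
                    static_cast<int>(qualifier));
    } else {
        // For example, the thread may not be in any apartment.
        std::printf("CoGetApartmentType failed: 0x%08lX\n",
                    static_cast<unsigned long>(hr));
    }
}
#endif
```

Calling printCurrentApartmentType() before and after CoInitializeEx() on a thread would show what COM reports in each state.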


Further reading

COM and MFC
Only a few parts of MFC are concerned with COM, while most parts of ATL deal with COM. MSDN article.
Automation
MSDN article.
Automation in a DLL
MSDN article.
Could you explain STA and MTA? This is a Stack Overflow question with a very nice (and comprehensible) overview of the two apartment types.
Four basic "dangers" when making calls from the MTA. This is an interesting Stack Overflow answer that states, for instance, that calls that go from the MTA to an STA and are shuffled through a proxy can be 10,000 times slower if the method being called is very "small".


References

  1. Creating ATL registrar scripts: http://msdn.microsoft.com/en-us/library/8350a3tf.aspx
  2. MSDN entry point for COM: http://msdn.microsoft.com/en-us/library/ee663262.aspx
  3. MSDN guide that introduces COM: http://msdn.microsoft.com/en-us/library/ms690156.aspx
  4. KB article that explains the different forms of binding: http://support.microsoft.com/kb/245115
  5. Description of the content of a ProgID Windows registry key that identifies a COM class: http://msdn.microsoft.com/en-us/library/dd542719.aspx
  6. Wikipedia entry for IUnknown: http://en.wikipedia.org/wiki/IUnknown
  7. MSDN article for IUnknown: http://msdn.microsoft.com/en-us/library/ms680509.aspx
  8. Wikipedia entry for IDispatch: http://en.wikipedia.org/wiki/IDispatch
  9. MSDN article for IDispatch: http://msdn.microsoft.com/en-us/library/ms221608.aspx
  10. COM IDL & Interface Design, Dr. Al Major, Wrox Press
  11. tlbexp.exe (Type Library Exporter): http://msdn.microsoft.com/en-us/library/hfzzah2c.aspx
  12. regasm.exe (Assembly Registration Tool): http://msdn.microsoft.com/en-us/library/tzat5yw6.aspx
  13. tlbimp.exe (Type Library Importer): http://msdn.microsoft.com/en-us/library/tt0cf3sx.aspx
  14. #import Directive (C++): http://msdn.microsoft.com/en-us/library/8etzzkb6.aspx
  15. The IDL and ACF files: http://msdn.microsoft.com/en-us/library/aa378708.aspx
  16. MIDL language reference: http://msdn.microsoft.com/en-us/library/aa367088.aspx
  17. The Interface Definition Language (IDL) File: http://msdn.microsoft.com/en-us/library/aa378712.aspx
  18. The COM Library: http://msdn.microsoft.com/en-us/library/ms682442.aspx
  19. Description of the content of a class ID Windows registry key that identifies a COM class: http://msdn.microsoft.com/en-us/library/ms691424.aspx
  20. Description of the content of an interface ID Windows registry key that identifies a COM interface: http://msdn.microsoft.com/en-us/library/ms680091.aspx
  21. Native COM server self-registration: http://msdn.microsoft.com/en-us/library/ms694515.aspx
  22. DLL Surrogates: http://msdn.microsoft.com/en-us/library/ms695225.aspx
  23. Building COM Servers in .NET: http://www.codeproject.com/Articles/12579/Building-COM-Servers-in-NET
  24. An introduction to event sinks in C++ in the context of ATL COM Add Ins: http://www.codeproject.com/Articles/17832/Event-Sinks
  25. Events in COM and Connectable Objects: http://msdn.microsoft.com/en-us/library/windows/desktop/ms694379.aspx
  26. Event Sink Maps: http://msdn.microsoft.com/en-us/library/2kxtf8s3.aspx
  27. Supporting IDispEventImpl: http://msdn.microsoft.com/en-us/library/ttycc9bs.aspx
  28. COM Events in .NET: http://msdn.microsoft.com/en-us/library/1hee64c7.aspx
  29. Processes, Threads, and Apartments: https://msdn.microsoft.com/en-us/library/ms693344.aspx
  30. What are these "Threading Models" and why do I care?: http://blogs.msdn.com/b/larryosterman/archive/2004/04/28/122240.aspx
  31. Threading in COM: http://en.wikipedia.org/wiki/Component_Object_Model#Threading_in_COM
  32. Single-Threaded Apartments: https://msdn.microsoft.com/en-us/library/ms680112.aspx
  33. In-Process Server Threading Issues: https://msdn.microsoft.com/en-us/library/ms687205.aspx
  34. Registry key declaration for "InprocServer32": https://msdn.microsoft.com/en-us/library/windows/desktop/ms682390.aspx
  35. INFO: Descriptions and Workings of OLE Threading Models: https://support.microsoft.com/en-gb/help/150777/info-descriptions-and-workings-of-ole-threading-models