Last update Feb 8, 2003
Overview
What is D?
D is a general purpose systems and applications programming language.
It is a higher level language than C++, but retains the ability
to write high performance code and interface directly with the
operating system APIs and with hardware.
D is well suited to writing medium to large scale
million line programs with teams of developers. D is easy
to learn, provides many capabilities to aid the programmer,
and is well suited to aggressive compiler optimization technology.
D is not a scripting language, nor an interpreted language. It doesn't
come with a VM,
a religion, or an overriding
philosophy. It's a practical language for practical programmers
who need to get the job done quickly, reliably, and leave behind
maintainable, easy to understand code.
D is the culmination of decades of experience implementing
compilers for many diverse languages, and attempting to construct
large projects using those languages. D draws inspiration from
those other languages (most especially C++) and tempers it with
experience and real world practicality.
Why D?
Why, indeed. Who needs another programming language?
The software industry has come a long way since the C language was
invented.
Many new concepts were added to the language with C++, but backwards
compatibility with C was maintained, including compatibility with
nearly all the weaknesses of the original design.
There have been many attempts to fix those weaknesses, but the
compatibility issue frustrates them.
Meanwhile, both C and C++ undergo a constant accretion of new
features. These new features must be carefully fitted into the
existing structure without requiring rewriting old code.
The end result is very complicated - the C standard is nearly
500 pages, and the C++ standard is about 750 pages!
The reality of the C++ compiler business is that few compilers
effectively implement the entire standard.
C++ programmers tend to program in particular islands of the language,
i.e. getting very proficient using certain features while avoiding
other feature sets. While the code is portable from compiler
to compiler, it can be hard to port it from programmer to programmer.
A great strength of C++ is that it can support many radically
different styles of programming - but in long term use, the
overlapping and contradictory styles are a hindrance.
It's frustrating that such a powerful language
does not do basic things like resizing arrays and concatenating
strings.
Yes, C++ does provide the meta programming ability to implement
resizable arrays and strings like the vector type in the STL.
Such fundamental features, however, ought to be part of the language.
Can the power and capability of C++ be extracted, redesigned,
and recast into a language that is simple, orthogonal,
and practical?
Can it all be put into a package
that is easy for compiler
writers to correctly implement, and
which enables compilers to efficiently generate aggressively
optimized code?
Modern compiler technology has progressed to the point where language
features that exist only to compensate for primitive compiler
technology can be omitted. (An
example of this would be the 'register' keyword in C; a more
subtle example is the macro
preprocessor in C.)
We can rely on modern compiler optimization technology to produce
acceptable code quality without such features.
D aims to reduce software development costs by at least 10% by adding
in proven
productivity enhancing features and by adjusting language features so that
common, time-consuming bugs are eliminated from the start.
Features To Keep From C/C++
The general look of D is like C and C++. This makes it easier to learn
and port code to D. Transitioning from C/C++ to D should feel natural; the
programmer will not have to learn an entirely new way of doing things.
Using D will not mean that the programmer will become restricted to a
specialized runtime vm (virtual machine) like the Java vm or the Smalltalk vm.
There is no D vm; it's a straightforward compiler that generates linkable object
files. D connects to the operating system just like C does.
The usual familiar tools like make will fit right in with D development.
- The compile/link/debug development model will be
carried forward,
although nothing precludes D from being compiled into bytecode
and interpreted.
- Exception handling.
More and more experience with exception handling shows it to be a
superior way to handle errors compared with the traditional C method of using
error codes and errno globals.
- Runtime Type Identification.
This is partially implemented in C++;
in D it is taken to its
next logical step. Fully supporting it enables better garbage
collection, better debugger support, more automated persistence, etc.
- D maintains function link compatibility with the C calling
conventions. This makes
it possible for D programs to access operating system API's directly.
Programmers' knowledge and experience with existing programming API's
and paradigms can be carried forward to D with minimal effort.
- Operator overloading.
D programs can overload operators enabling
extension of the basic types with user defined types.
- Templates.
Templates are a way to implement generic programming.
Other ways are using macros, or having a variant data type. Using
macros is out. Variants are straightforward, but the loss of type
checking is a problem. The difficulties with C++ templates are their
complexity, their poor fit with the syntax of the language,
the various rules for conversions and overloading layered on top of
them, etc. D offers a much simpler way of doing templates.
- RAII
(Resource Acquisition Is Initialization).
RAII techniques are an essential component of writing reliable
software.
- Down and dirty programming. D will retain the ability to
do down-and-dirty programming without resorting to referring to
external modules compiled in a different language. Sometimes,
it's just necessary to coerce a pointer or dip into assembly
when doing systems work. D's goal is not to prevent down
and dirty programming, but to minimize the need for it in
solving routine coding tasks.
Features To Drop
- C source code compatibility. Extensions to C that maintain
source compatibility
have already been done (C++ and Objective-C). Further work in this
area is hampered by so much legacy code that it is unlikely that significant
improvements can be made.
- Link compatibility with C++. The C++ runtime object model is just
too complicated - properly supporting it would essentially imply
making D a full C++ compiler too.
- The C preprocessor. Macro processing is an easy way to extend
a language, adding in faux features that aren't really there (invisible
to the symbolic debugger). Conditional compilation, layered with
#include text, macros, token concatenation, etc., essentially forms
not one language but two merged together with no obvious distinction
between them. Even worse (or perhaps for the best) the C preprocessor
is a very primitive macro language. It's time to step back, look at
what the preprocessor is used for, and design support for those
capabilities directly into the language.
- Multiple inheritance. It's a complex
feature of debatable value. It's very difficult to implement in an
efficient manner, and compilers are prone to many bugs in implementing
it. Nearly all the value of
MI can be handled with single inheritance
coupled with interfaces and aggregation. What's left does not
justify the weight of MI implementation.
- Namespaces. An attempt to deal with the problems resulting from
linking together independently developed pieces of code that
have conflicting names. The idea of modules is simpler and works
much better.
- Tag name space. This misfeature of C is where the tag names
of structs are in a separate but parallel symbol table. C++
attempted to merge the tag name space with the regular name space,
while retaining backward compatibility with legacy C code. The
result is something better off chucked.
- Forward declarations. C compilers semantically only know
about what has lexically preceded the current state. C++ extends this
a little, in that class members can rely on forward referenced class
members. D takes this to its logical conclusion: forward declarations
are no longer necessary at all. Functions can be defined in a natural
order rather than the typical inside-out order commonly used in C
programs to avoid writing forward declarations.
- Include files. A major cause of slow compiles as each
compilation unit
must reparse enormous quantities of header files. Include files
should be done as importing a symbol table.
- Creating object instances on the stack. In D, all class objects
are by reference. This eliminates the need for copy constructors,
assignment operators, complex destructor
semantics, and interactions with exception handling stack unwinding.
Memory resources get freed by the garbage collector; other resources
are freed by try-finally blocks.
- Trigraphs and digraphs. Wide char is the modern solution to
international character sets.
- Preprocessor. Modern languages should not be text processing;
they should be symbolic processing.
- Non-virtual member functions. In C++, a class designer decides
in advance if a function is to be virtual or not. Forgetting to retrofit
the base class member function to be virtual when the function gets
overridden is a common (and very hard to find) coding error.
Making all member functions virtual, and letting the compiler convert
to non-virtual those functions it can determine have no overrides,
is much more reliable.
- Bit fields of arbitrary size.
Bit fields are a complex, inefficient feature rarely used.
- Support for 16 bit computers.
No consideration is given in D for mixed near/far pointers and all the
machinations necessary to generate good 16 bit code. The D language
design assumes at least a 32 bit flat memory space. D will fit smoothly
into 64 bit architectures.
- Mutual dependence of compiler passes. In C++, successfully parsing
the source text relies on having a symbol table, and on the various
preprocessor commands. This makes it
impossible to preparse C++ source, and makes writing code analyzers
and syntax directed editors painfully difficult to do correctly.
- Compiler complexity. Reducing the complexity of an implementation
makes it more likely that multiple, correct implementations
are available.
- Distinction between . and ->. This distinction
is really not necessary. The . operator serves just as well for
pointer dereferencing.
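As a minimal sketch of that last point, the same . operator reaches a member through a pointer. The Point struct and the shiftRight helper are made up for illustration, and the code is written against a current D compiler rather than the 2003 one.

```d
struct Point
{
    int x;
    int y;
}

// Hypothetical helper: updates a Point through a pointer, using '.' only.
void shiftRight(Point* p, int dx)
{
    p.x += dx;   // C and C++ would require p->x here
}

void main()
{
    Point pt = Point(1, 2);
    shiftRight(&pt, 10);
    assert(pt.x == 11 && pt.y == 2);
}
```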
Who D is For
- Programmers who routinely use lint or similar code analysis tools
to eliminate bugs before the code is even compiled.
- People who compile with maximum warning levels turned on and who
instruct the compiler to treat warnings as errors.
- Programming managers who are forced to rely on programming style
guidelines to avoid common C bugs.
- Those who decide the promise of C++ object oriented
programming is not fulfilled because of its complexity.
- Programmers who enjoy the expressive power of C++ but are
frustrated by
the need to expend much effort explicitly managing memory and finding
pointer bugs.
- Projects that need built-in testing and verification.
- Teams who write apps with a million lines of code in them.
- Programmers who think the language should provide enough
features to obviate
the continual necessity to manipulate pointers directly.
- Numerical programmers. D directly supports many of the
capabilities needed by numerics programmers, like
the complex data type and
defined behavior for
NaN's and infinities.
(These are added in the new
C99 standard, but not in C++.)
- D's lexical analyzer and parser are totally independent of each other and of the
semantic analyzer. This means it is easy to write simple tools to manipulate D source
perfectly without having to build a full compiler. It also means that source code can be
transmitted in tokenized form for specialized applications.
Who D is Not For
- Realistically, nobody is going to convert million line C or C++
programs into D, and since D does not compile unmodified C/C++
source code, D is not for
legacy apps. (However, D supports legacy C API's very well.)
- Very small programs - a scripting or interpreted language like
Python,
DMDScript,
or Perl is likely more suitable.
- As a first programming language - Basic or Java is more suitable
for beginners. D makes an excellent second language for intermediate
to advanced programmers.
Major Features of D
This section lists some of the more interesting features of D
that set it apart from C and C++.
Garbage Collection
D memory allocation is fully garbage collected. Empirical experience
suggests that a lot of the complicated features of C++ are necessary
in order to manage memory deallocation. With garbage collection, the
language gets much simpler.
There's a perception that garbage collection is for lazy, junior
programmers. I remember when that was said about C++; after all,
there's nothing in C++ that cannot be done in C, or in assembler
for that matter.
Garbage collection eliminates the tedious, error prone memory allocation
tracking code necessary in C and C++. This not only means much
faster development time and lower maintenance costs,
but the resulting program frequently runs
faster!
Sure, garbage collectors can be used with C++, and I've used them
in my own C++ projects. The language isn't friendly to collectors,
however, which impedes their effectiveness. Much of the runtime
library code can't be used with collectors.
For a fuller discussion of this, see garbage
collection.
Contracts
Design by Contract (invented by B. Meyer) is a revolutionary technique
to aid in ensuring the correctness of programs. D's version of
DBC includes function preconditions, function postconditions, class
invariants, and assert contracts.
See Contracts for D's implementation.
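A rough sketch of what these pieces can look like follows. The isqrt function and the Account class are invented for illustration, and the syntax is what a current D compiler accepts, which may differ in detail from the 2003 language. Contracts are normally checked unless the program is built in release mode.

```d
// Integer square root with its contract spelled out.
int isqrt(int n)
in
{
    assert(n >= 0, "isqrt needs a non-negative argument");
}
out (result)
{
    // The answer is the largest r with r*r <= n (long avoids overflow).
    long r = result;
    assert(r * r <= n && (r + 1) * (r + 1) > n);
}
do  // older versions of the language spelled this keyword 'body'
{
    long r = 0;
    while ((r + 1) * (r + 1) <= n)
        ++r;
    return cast(int) r;
}

// Hypothetical class whose invariant is checked around public member calls.
class Account
{
    private int balance;

    invariant()
    {
        assert(balance >= 0);
    }

    void deposit(int amount)
    in
    {
        assert(amount > 0);
    }
    do
    {
        balance += amount;
    }

    int current() { return balance; }
}

void main()
{
    assert(isqrt(17) == 4);

    auto acct = new Account;
    acct.deposit(50);
    assert(acct.current() == 50);
}
```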
Declaration vs Definition
C++ usually requires that functions and classes be declared twice - the declaration
that goes in the .h header file, and the definition that goes in the .c source
file. This is an error prone and tedious process. Obviously, the programmer
should only need to write it once, and the compiler should then extract the
declaration information and make it available for symbolic importing. This is
exactly how D works.
Example:
class ABC
{
int func() { return 7; }
static int z = 7;
}
int q;
There is no longer a need for a separate definition of member functions, static
members, externs, nor for clumsy syntaxes like:
int ABC::func() { return 7; }
int ABC::z = 7;
extern int q;
Note: Of course, in C++, trivial functions like { return 7; }
are written inline too, but complex ones are not. In addition, if
there are any forward references, the functions need to be prototyped.
The following will not work in C++:
class Foo
{
    int foo(Bar *c) { return c->bar(); }
};
class Bar
{
  public:
    int bar() { return 3; }
};
But the equivalent D code will work:
class Foo
{
    int foo(Bar c) { return c.bar(); }
}
class Bar
{
    int bar() { return 3; }
}
Whether a D function is inlined or not is determined by the
optimizer settings.
Modules
Source files have a one-to-one correspondence with modules.
Instead of #include'ing the text of a file of declarations,
just import the module. There is no need to worry about
multiple imports of the same module, no need to wrap header
files with #ifndef/#endif or #pragma once kludges,
etc.
Real Typedefs
C and C++ typedefs are really type aliases, as no new
type is really introduced. D implements real typedefs, where:
typedef int handle;
really does create a new type handle. Type checking is
enforced, and typedefs participate in function overloading.
For example:
int foo(int i);
int foo(handle h);
Arrays
C arrays have several faults that can be corrected.
D arrays come in 4 varieties: pointers, static arrays, dynamic
arrays, and associative arrays.
See Arrays.
Strings
String manipulation is so common, and so clumsy in C and C++, that
it needs direct support in the language. Modern languages handle
string concatenation, copying, etc., and so does D. Strings are
a direct consequence of improved array handling.
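As an illustration only, here is how resizing and concatenation look with a current D compiler. The joinWords helper is hypothetical, and the string type name is today's spelling, which may not match the 2003 library.

```d
// Hypothetical helper: joins words with ", " using the append operator.
string joinWords(string[] words)
{
    string result;
    foreach (i, w; words)
    {
        if (i > 0)
            result ~= ", ";   // ~= appends to the array
        result ~= w;
    }
    return result;
}

void main()
{
    int[] numbers = [1, 2, 3];
    numbers ~= 4;            // grow by appending one element
    numbers.length = 6;      // or resize explicitly; new elements are 0
    assert(numbers == [1, 2, 3, 4, 0, 0]);

    string greeting = "hello" ~ ", " ~ "world";   // ~ concatenates
    assert(greeting == "hello, world");
    assert(joinWords(["x", "y"]) == "x, y");
}
```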
Bit type
The fundamental data type is the bit, and D has a bit data
type. This is most useful in creating arrays of bits:
bit[] foo;
Associative Arrays
Associative arrays are arrays with an arbitrary data type as
the index rather than being limited to an integer index.
In essence, associative arrays are hash tables. Associative
arrays make it easy to build fast, efficient, bug-free symbol
tables.
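A small sketch of the idea, assuming a current D compiler: the countWords function is made up for illustration and uses a string-keyed array as a tiny symbol table.

```d
// Hypothetical helper: counts how often each word occurs, keyed by the word.
int[string] countWords(string[] words)
{
    int[string] counts;
    foreach (w; words)
        counts[w]++;        // a missing key is created on first use
    return counts;
}

void main()
{
    auto counts = countWords(["if", "while", "if"]);
    assert(counts["if"] == 2);
    assert("while" in counts);            // 'in' tests for a key
    assert(("for" in counts) is null);    // absent keys yield null
    assert(counts.length == 2);
}
```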
Debug Attributes and Statements
Now debug is part of the syntax of the language.
The code can be enabled or disabled at compile time, without the
use of macros or preprocessing commands. The debug syntax gives
a consistent, portable, and understandable way to recognize that real
source code needs to be able to generate both debug compilations and
release compilations.
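For instance, a debug statement might be used as sketched below. The total function is hypothetical, and the sketch assumes a current D compiler, where the tracing line is compiled in only when the compiler's debug switch (-debug in dmd) is given.

```d
import std.stdio : writefln;

// Hypothetical helper: sums the samples; a debug build also traces each step.
int total(int[] samples)
{
    int sum = 0;
    foreach (i, v; samples)
    {
        sum += v;
        debug writefln("sample %s = %s, running sum %s", i, v, sum);
    }
    return sum;
}

void main()
{
    assert(total([1, 2, 3]) == 6);
}
```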
Synchronization
Multithreaded programming is becoming more and more mainstream,
and D provides primitives to build multithreaded programs with.
Synchronization can be done at either the method or the object level.
synchronize int func() { . }
Synchronized functions allow only one thread at a time to be executing that
function.
The synchronize statement puts a mutex around a block of statements,
controlling access either by object or globally.
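A minimal sketch of the statement form, written for a current D compiler, where the keyword is spelled synchronized. The Counter class is invented for illustration, and Thread comes from the runtime library that ships with today's compilers.

```d
import core.thread : Thread;

// Hypothetical shared counter; the mutex is the object itself.
class Counter
{
    private int count;

    void add(int n)
    {
        synchronized (this)   // only one thread at a time runs this block
        {
            foreach (i; 0 .. n)
                ++count;
        }
    }

    int value()
    {
        synchronized (this)
        {
            return count;
        }
    }
}

void main()
{
    auto c = new Counter;
    Thread[] workers;
    foreach (t; 0 .. 4)
    {
        auto th = new Thread({ c.add(1000); });
        th.start();
        workers ~= th;
    }
    foreach (th; workers)
        th.join();
    assert(c.value() == 4000);
}
```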
Lightweight Aggregates
D supports simple C style struct's, both for compatibility with
C data structures and because they're useful when the full power
of classes is overkill.
Classes
D's object oriented nature comes from classes.
The inheritance model is single inheritance enhanced
with interfaces. The class Object sits at the root
of the inheritance hierarchy, so all classes implement
a common set of functionality.
Classes are instantiated
by reference, and so complex code to clean up after exceptions
is not required.
Exception Handling
The superior try-catch-finally model is used rather than just
try-catch. There's no need to create dummy objects just to have
the destructor implement the finally semantics.
Which of "release resource using destructor" (C++) or "release resource
using finally" (D) is better is a topic for much debate, but
I obviously am in the latter camp.
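To make the finally semantics concrete, here is a small sketch, assuming a current D compiler. The events list and the riskyStep function are made up purely to record the order in which things happen.

```d
string[] events;  // records the order of events for the demonstration

// Hypothetical operation that may fail after acquiring a resource.
void riskyStep(bool fail)
{
    events ~= "acquire";
    try
    {
        if (fail)
            throw new Exception("step failed");
        events ~= "work";
    }
    finally
    {
        events ~= "release";   // runs on both the normal and the exceptional path
    }
}

void main()
{
    riskyStep(false);
    assert(events == ["acquire", "work", "release"]);

    events = null;
    try
    {
        riskyStep(true);
    }
    catch (Exception e)
    {
        events ~= "caught";
    }
    assert(events == ["acquire", "release", "caught"]);
}
```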
In, Out, and Inout Parameters
Specifying whether a parameter is in, out, or inout not only helps make functions more
self-documenting, it eliminates much of the necessity for pointers
without sacrificing anything, and it opens up possibilities
for more compiler help in finding coding problems.
This also makes it possible for D to directly interface to a
wider variety of foreign API's. There would be no need for
workarounds like "Interface Definition Languages".
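A short sketch of parameter directions follows, using hypothetical divide and clamp functions. It is written for a current D compiler, where the read-and-update direction is spelled ref rather than inout.

```d
// 'in' parameters are read-only inputs; 'out' parameters carry results back
// and are reset to their default value on entry.
void divide(in int a, in int b, out int quotient, out int remainder)
{
    quotient = a / b;
    remainder = a % b;
}

// 'ref' reads and updates the caller's variable without any pointer syntax.
void clamp(ref int value, int lo, int hi)
{
    if (value < lo) value = lo;
    if (value > hi) value = hi;
}

void main()
{
    int q, r;
    divide(17, 5, q, r);
    assert(q == 3 && r == 2);

    int level = 15;
    clamp(level, 0, 10);
    assert(level == 10);
}
```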
Direct Access to C API's
Not only does D have data types that correspond to C types,
it provides direct access to C functions. There is no need
to write wrapper functions, parameter swizzlers, nor code to copy
aggregate members one by one.
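As an illustration, the C library's strlen can be declared and called with no glue code. This sketch assumes a current D compiler and its standard library (toStringz comes from std.string); the cLength helper is hypothetical.

```d
import std.string : toStringz;

// Declares the C library's strlen so D can call it directly.
extern (C) size_t strlen(const(char)* s);

// Hypothetical helper: length of a D string as C sees it.
size_t cLength(string text)
{
    // D strings are not guaranteed to be zero terminated; toStringz adds the terminator.
    return strlen(toStringz(text));
}

void main()
{
    assert(strlen("hello") == 5);          // string literals are zero terminated
    assert(cLength("D language") == 10);
}
```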
Inline Assembler
Device drivers, high performance system applications, embedded systems,
and specialized code sometimes need to dip into assembly language
to get the job done. While D implementations are not required
to implement the inline assembler, it is defined and part of the
language. Most assembly code needs can be handled with it,
obviating the need for separate assemblers or DLL's.
Many D implementations will also support intrinsic functions
analogously to C's support of intrinsics for I/O port manipulation,
direct access to special floating point operations, etc.
Versioning
D provides built-in support for generation of multiple versions
of a program from the same text. It replaces the C preprocessor
#if/#endif technique.
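A sketch of how this can replace #if/#endif, assuming a current D compiler: Windows and Posix are identifiers predefined by today's compilers, while Trial, pathSeparator, and maxDocuments are made-up names. A user-defined identifier such as Trial is switched on from the command line (-version=Trial in dmd).

```d
// Platform selection with predefined version identifiers.
version (Windows)
    enum pathSeparator = '\\';
else version (Posix)
    enum pathSeparator = '/';
else
    static assert(0, "unsupported platform");

// A hypothetical user-defined version identifier.
version (Trial)
    enum maxDocuments = 3;
else
    enum maxDocuments = 1000;

void main()
{
    static assert(pathSeparator == '/' || pathSeparator == '\\');
    assert(maxDocuments == 3 || maxDocuments == 1000);
}
```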
D provides the capability of marking
declarations as deprecated which makes it easier to verify
that obsolete API's are not being inadvertently used.
Unit Tests
Unit tests can be added to a class, such that they are automatically
run upon program startup. This aids in verifying, in every build,
that class implementations weren't inadvertently broken. The unit
tests form part of the source code for a class. Creating them
becomes a natural part of the class development process, as opposed
to throwing the finished code over the wall to the testing group.
Unit tests can be done in other languages, but the result is kludgy
and the languages just aren't accommodating of the concept.
Unit testing is a main feature of D. For library functions it works
out great, serving both to guarantee that the functions
actually work and to illustrate how to use the functions.
Consider the many C++ library and application code bases out there for
download on the web. How much of it comes with *any* verification
tests at all, let alone unit testing? Less than 1%? The usual practice
is if it compiles, we assume it works. And we wonder if the warnings
the compiler spits out in the process are real bugs or just nattering
about nits.
Along with design by contract, unit testing makes D far and away
the best language for writing reliable, robust systems applications.
Unit testing also gives us a quick-and-dirty estimate of the quality
of some unknown piece of D code dropped in our laps - if it has no
unit tests and no contracts, it's unacceptable.
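The following sketch shows a unit test living inside the class it checks. The Stack class is invented for illustration and the code assumes a current D compiler, where the test block runs at startup only when the program is built with the unit test switch (-unittest in dmd).

```d
// Hypothetical class with its test kept next to the implementation.
class Stack
{
    private int[] items;

    void push(int v) { items ~= v; }

    int pop()
    {
        int v = items[$ - 1];
        items = items[0 .. $ - 1];
        return v;
    }

    bool empty() { return items.length == 0; }

    unittest
    {
        auto s = new Stack;
        assert(s.empty());
        s.push(1);
        s.push(2);
        assert(s.pop() == 2);
        assert(s.pop() == 1);
        assert(s.empty());
    }
}

void main()
{
    auto s = new Stack;
    s.push(42);
    assert(s.pop() == 42);
}
```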
Operator Overloading
Classes can be crafted that work with existing operators to extend
the type system to support new types. An example would be creating
a bignumber class and then overloading the +, -, * and / operators
to enable using ordinary algebraic syntax with them.
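A small stand-in for the bignumber idea is sketched below with a hypothetical Fraction struct. It uses the opBinary mechanism of current D compilers, which is an assumption on my part about the toolchain; the overloading syntax of the 2003 language may differ.

```d
// Hypothetical fraction type supporting + and * (results are not reduced).
struct Fraction
{
    long num;
    long den;

    Fraction opBinary(string op)(Fraction rhs) const
        if (op == "+" || op == "*")
    {
        static if (op == "+")
            return Fraction(num * rhs.den + rhs.num * den, den * rhs.den);
        else
            return Fraction(num * rhs.num, den * rhs.den);
    }
}

void main()
{
    auto half = Fraction(1, 2);
    auto third = Fraction(1, 3);
    assert(half + third == Fraction(5, 6));
    assert(half * third == Fraction(1, 6));
}
```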
Templates
D templates offer a clean way to support generic programming while
offering the power of partial specialization.
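As a hedged sketch in the syntax of a current D compiler (the 2003 template syntax may differ), the hypothetical largest function shows generic code, and the hypothetical Kind template shows a partial specialization for pointer types.

```d
// Generic function template: one definition serves every comparable type.
T largest(T)(T a, T b)
{
    return a > b ? a : b;
}

// Primary template...
template Kind(T)
{
    enum Kind = "value";
}

// ...and a partial specialization chosen for any pointer type.
template Kind(T : T*)
{
    enum Kind = "pointer";
}

void main()
{
    assert(largest(3, 7) == 7);
    assert(largest(2.5, 1.5) == 2.5);
    static assert(Kind!int == "value");
    static assert(Kind!(int*) == "pointer");
}
```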
RAII
RAII is a modern software development technique to manage resource
allocation and deallocation. D supports RAII in a controlled,
predictable manner that is independent of the garbage collection
cycle.
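One way to picture this, assuming a current D compiler: a struct destructor runs when the variable leaves scope, with no garbage collection involved. The Handle type and the openHandles counter are made up for illustration, and the 2003 language may express the same idea with different syntax.

```d
int openHandles;   // stands in for a real resource count

// Hypothetical resource wrapper: acquire in the constructor, release in the destructor.
struct Handle
{
    int id;

    this(int id)
    {
        this.id = id;
        ++openHandles;
    }

    ~this()
    {
        if (id != 0)
            --openHandles;
    }

    @disable this(this);   // forbid copies so the release happens exactly once
}

void useHandle()
{
    auto h = Handle(1);
    assert(openHandles == 1);
}   // h is released here, at the end of the scope

void main()
{
    useHandle();
    assert(openHandles == 0);
}
```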
Debug Support
A modern language should do all it can to help the programmer flush out bugs in the
code. Help can come in many forms; from making it easy to use more robust techniques,
to compiler flagging of obviously incorrect code, to runtime checking.
Support for Robust Techniques
- Dynamic arrays instead of pointers
- Reference variables instead of pointers
- Reference objects instead of pointers
- Garbage collection instead of explicit memory management
- Built-in primitives for thread synchronization
- No macros to inadvertently slam code
- Inline functions instead of macros
- Vastly reduced need for pointers
- Integral type sizes are explicit
- No more uncertainty about the signed-ness of chars
- No need to duplicate declarations in source and header files.
- Explicit parsing support for adding in debug code.
Compile Time Checks
- Stronger type checking
- Explicit initialization required
- Unused local variables not allowed
- No empty ; for loop bodies
- Assignments do not yield boolean results
- Deprecating of obsolete API's
Runtime Checking
- assert() expressions
- array bounds checking
- undefined case in switch exception
- out of memory exception
- In, out, and class invariant design by contract support
Unit Tests
- organized on a per class basis
- testing code is part of the source code
- unit tests can be turned on or off by compiler switch
Sample D Program (sieve.d)
/* Sieve of Eratosthenes prime numbers */
import stdio;

bit[8191] flags;

int main()
{
    int i, count, prime, k, iter;

    printf("10 iterations\n");
    for (iter = 1; iter <= 10; iter++)
    {
        count = 0;
        flags[] = 1;
        for (i = 0; i < flags.length; i++)
        {
            if (flags[i])
            {
                prime = i + i + 3;
                k = i + prime;
                while (k < flags.length)
                {
                    flags[k] = 0;
                    k += prime;
                }
                count += 1;
            }
        }
    }
    printf ("\n%d primes", count);
    return 0;
}
Copyright (c) 1999-2003 by Digital Mars, All Rights Reserved