Handling Complexity
An Object is an Encapsulation of Data and Methods
Here I will discuss data abstraction and modularization, two complementary methodologies developed to keep programming manageable. Together they form the concept of encapsulation.
Data Abstraction
If you’ve ever written a large program based on a certain type of data, then you know what a pain it is to have to later alter the program to handle a different kind of data. That little change can mean having to rewrite the entire program with a whole new set of functions to handle the new data type.
But suppose you split up your original solution. First, you have some code that describes your data and its basic operations. Then you have your main code, which actually solves the problem, with any generic kind of data. With this kind of design, all you have to do to change the data type is to write code that describes the new data type; you’ll need to make few, if any, modifications to your main program.
Sorting algorithm
    for each line of data in the input file
        read the line of data
        if there is no data in the output file
            store the data in the output file
        else
            start at the first data item in the output file
            while there is a current output item and the new data should come after it
                move to the next data item in the output file
            end while
            insert the new data before the current item (or at the end if no items remain)
        end if-else
    end for
Numeric-type data abstraction
- integer: value
- integer-compare-function()
String-type data abstraction
- string: value
- string-compare-function()
In the above pseudocode example, the sorting algorithm works no matter what kind of data you’re sorting, because the only type-specific step (deciding which of two items comes first) is delegated to a compare function. This principle of separating data from the code that manipulates it is called data abstraction, because the data is defined only abstractly in the main solution. It allows the programmer to focus on algorithms that handle generic data. The same program code can then be used to manipulate many different kinds of data, as long as each kind of data is defined separately, along with the simple functions that handle it directly.
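To make the pseudocode concrete, here is a minimal Python sketch of the same idea. The names insertion_sort, compare_integers, and compare_strings are my own illustrative choices, and the case-insensitive string ordering is just an assumption about what a string compare function might do. The point is only that the sorting code never looks at the data’s type; it asks the compare function.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def compare_integers(a: int, b: int) -> int:
    """Return a negative, zero, or positive number as a is <, ==, or > b."""
    return (a > b) - (a < b)


def compare_strings(a: str, b: str) -> int:
    """Compare two strings, ignoring case (an assumed ordering rule)."""
    a_key, b_key = a.lower(), b.lower()
    return (a_key > b_key) - (a_key < b_key)


def insertion_sort(items: List[T], compare: Callable[[T, T], int]) -> List[T]:
    """Sort any kind of data, given a compare function for that data."""
    output: List[T] = []
    for new_item in items:
        position = 0
        # Advance while the new item should come after the current output item.
        while position < len(output) and compare(new_item, output[position]) > 0:
            position += 1
        # Insert before the current item (or at the end if none remains).
        output.insert(position, new_item)
    return output


numbers = insertion_sort([42, 7, 19], compare_integers)
words = insertion_sort(["pear", "Apple", "fig"], compare_strings)
```

Switching from numbers to strings changes only the compare function passed in; insertion_sort itself stays untouched.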
An object is a self-contained entity that contains both data and the functions to manipulate that data. In object-oriented jargon, functions are called methods. The data is an essential part of the object because it describes the object’s state. (You could kind of think of the data as an object’s variables, except that OOP purists would castigate you for carrying over this archaic construct from procedural programming languages.) The methods are equally important, because they define what the object can do with its data. (OOP methods are equivalent to subroutines, functions, and procedures in other languages.) An object’s data and methods are collectively called its members. The members of an object form a data abstraction.
Thus an object-oriented programmer can define objects that know what kind of data they have and how to handle it. So when programmers write their main solution code, they don’t have to worry about the data type. And if the kind of data that the program needs should later change (as it always does), all a programmer needs to do is create new objects and make minimal changes to the main program code.
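As a rough illustration, the two data abstractions from the sorting pseudocode could be written as Python classes, each holding a value and a method that knows how to compare it. The class names, the comes_after method, and the case-insensitive string rule are all hypothetical choices of mine, not part of any standard library.

```python
class NumericItem:
    """A number that knows how to order itself against another number."""

    def __init__(self, value: int) -> None:
        self.value = value  # the object's data (its state)

    def comes_after(self, other: "NumericItem") -> bool:
        return self.value > other.value


class StringItem:
    """A string that knows how to order itself, ignoring case."""

    def __init__(self, value: str) -> None:
        self.value = value

    def comes_after(self, other: "StringItem") -> bool:
        return self.value.lower() > other.value.lower()


def sort_items(items):
    """Main solution code: works with any object that has comes_after()."""
    output = []
    for new_item in items:
        position = 0
        while position < len(output) and new_item.comes_after(output[position]):
            position += 1
        output.insert(position, new_item)
    return output


sorted_numbers = [
    item.value
    for item in sort_items([NumericItem(3), NumericItem(1), NumericItem(2)])
]
sorted_words = [
    item.value for item in sort_items([StringItem("kiwi"), StringItem("Banana")])
]
```

Supporting a new kind of data here means writing one more class with a comes_after method; sort_items does not change.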
Modularization
Modularization means breaking up a program into smaller sections, or modules, each of which does a complete little task. An example of modular design is the top-down approach to solving problems: Break the entire problem into several smaller problems, which are in turn broken down into still smaller problems. Do this until each sub-problem is small enough to be easily understood, solved in detail (coded), and debugged. The modular approach helps developers understand and solve their problems better. It also allows the sub-problems to be tested and maintained independently, without affecting or being affected by the performance or bugginess of the other modules.
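Here is a small Python sketch of a top-down breakdown, assuming the overall task is “sort the lines of some text.” The function names are made up for illustration. Each function does one complete little task and can be tested on its own.

```python
from typing import List


def read_records(text: str) -> List[str]:
    """Sub-problem 1: split raw input into trimmed, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def sort_records(records: List[str]) -> List[str]:
    """Sub-problem 2: put the records in order."""
    return sorted(records)


def format_records(records: List[str]) -> str:
    """Sub-problem 3: turn the records back into output text."""
    return "\n".join(records)


def sort_text(text: str) -> str:
    """The whole problem, expressed only in terms of the smaller modules."""
    return format_records(sort_records(read_records(text)))


result = sort_text("pear\n\napple\n  fig  \n")
```

If read_records has a bug, it can be found and fixed without touching, or even reading, the other two functions.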
Encapsulation: A Modular Data Abstraction
Combined, data abstraction and modularization form the powerful concept of encapsulation: a module is used to implement a data abstraction. The module defines the structure and type of the data, and the methods that manipulate it. The methods serve as an interface through which outside program modules access the data in the encapsulated module, so the outside modules don’t have to worry about how to handle that specific kind of data. This lets the programmer focus on narrower problems and allows for simpler coding.
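A minimal sketch of such a module in Python might look like the class below. SortedNumbers and its method names are hypothetical; the only library call used is bisect.insort from the standard library, which inserts a value into an already sorted list. Outside code works through add, smallest, and as_list, and never needs to know that a plain list sits underneath.

```python
import bisect
from typing import List


class SortedNumbers:
    """A module that owns its data and exposes it only through methods."""

    def __init__(self) -> None:
        # Internal representation; the leading underscore marks it as
        # not part of the interface (a Python convention).
        self._values: List[int] = []

    def add(self, value: int) -> None:
        """Insert a value while keeping the collection sorted."""
        bisect.insort(self._values, value)

    def smallest(self) -> int:
        """Return the smallest stored value."""
        return self._values[0]

    def as_list(self) -> List[int]:
        """Return a copy, so callers cannot disturb the internal list."""
        return list(self._values)


store = SortedNumbers()
for number in (42, 7, 19):
    store.add(number)
```

Because callers depend only on the three methods, the internal list could later be swapped for some other structure without changing any outside code.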
In the sorting example above, we have three separate modules: the main program code, the numeric-type data abstraction, and the string-type data abstraction. As you might have guessed, the two modular data abstractions would be implemented as objects in an object-oriented programming language (OOPL), since they are self-contained units. What might not be so obvious is that some OOPLs, like Smalltalk and Java, would also implement the main solution algorithm as an object. As these languages see it, the solution has its own complete behavior and data.
Note that encapsulation is not unique to OOP. What is unique is that OOP uses objects to implement encapsulation, and its entire programming philosophy revolves around these modular data abstractions.
Information hiding: An important issue in encapsulation is whether or not an object should let other objects access its data. If an object let just anyone change its data, its state could be corrupted from anywhere in the program. And sometimes an object would prefer to tell another object about itself only when asked, rather than letting just anyone snoop around inside. This is not just a matter of being secretive: unless the programmer keeps strict control over which object accesses what, the program can quickly become impossible to reason about. Thus most OOPLs hide the information in their objects to some degree. Smalltalk, for example, doesn’t allow any outside objects to access data members. C++ and Java, on the other hand, let programmers specify which outside objects can access an object’s data, and how much they can access. (Cook 1990)
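For a feel of what this looks like in practice, here is a hedged Python sketch. Note that Python hides data mostly by convention and name mangling rather than by strict enforcement, so it is looser than the Smalltalk, C++, and Java rules described above. The StockLevel class and its members are invented for this example.

```python
class StockLevel:
    """Keeps its quantity hidden; outsiders can read it but not set it."""

    def __init__(self, quantity: int) -> None:
        if quantity < 0:
            raise ValueError("quantity cannot be negative")
        # Double underscore triggers name mangling, which discourages
        # (but does not strictly prevent) access from outside the class.
        self.__quantity = quantity

    @property
    def quantity(self) -> int:
        """Read-only view of the hidden data."""
        return self.__quantity

    def remove(self, amount: int) -> None:
        """The only sanctioned way to change the data, with checks."""
        if amount < 0 or amount > self.__quantity:
            raise ValueError("invalid amount")
        self.__quantity -= amount


stock = StockLevel(10)
stock.remove(3)
current = stock.quantity
```

Because quantity is a property with no setter, an assignment such as stock.quantity = 100 raises AttributeError, and every legitimate change has to pass through the checks in remove.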
Now that we understand that an object is an encapsulation of data and methods, let’s explore the other property that defines OOP: behavior sharing.