Static members of a class are not associated with any particular instance of it, so they can be accessed without creating one. You refer to them in code as shown below:
[CODE]
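#include <iostream>
#include <string>

class Test
{
private:
    static const std::string className;
public:
    static const std::string& getClassName()
    {
        return className;
    }
    //other members
};

//a static data member needs exactly one definition outside the class
//(the value "Test" is only illustrative)
const std::string Test::className = "Test";

int main()
{
    std::cout << Test::getClassName();
}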
Static members can also be accessed through an instance, as below (only main changes; the rest stays the same):
int main()
{
    Test testObject;
    std::cout << testObject.getClassName();
}
Usually, though, the first notation with the scope resolution operator (operator '::') is the better choice. It makes the code more readable, because the call itself tells you that the member is static and not a non-static one. It speaks for itself.
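To make the "not tied to any one instance" point concrete, here is a minimal sketch. The class Counter and its members are hypothetical names made up for this illustration; they are not part of the example above. The single static liveCount is shared by all objects and can be read with or without an instance.

```cpp
// Hypothetical class: counts how many Counter objects are currently alive.
class Counter
{
public:
    Counter() { ++liveCount; }
    Counter(const Counter&) { ++liveCount; }
    ~Counter() { --liveCount; }

    // Static member function: callable as Counter::getLiveCount()
    // or, less readably, as someObject.getLiveCount().
    static int getLiveCount() { return liveCount; }

private:
    static int liveCount; // one copy shared by every Counter
};

// The one out-of-class definition of the static data member.
int Counter::liveCount = 0;
```

With two Counter objects in scope, both Counter::getLiveCount() and anObject.getLiveCount() report the same shared count, and Counter::getLiveCount() still works once no object exists.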
Sunday, March 01, 2009
Accessing static members of a class
Posted by abnegator at 3/01/2009 02:23:00 PM
Labels: C++, scope resolution operator, static members
Saturday, January 24, 2009
Inline specializations and multiple inclusions
Recently, I came across a problem someone was facing on the forums. The code looked like this:
[CODE]
//file: header.h
#ifndef HEADER_H
#define HEADER_H
#include <iostream>
template<class T>
void foo(T val){
    std::cout << "foo<T>("<<val<<")\n";
}
template<>
inline void foo(double val){
    std::cout << "foo<double>(" << val <<")\n";
}
#endif

//file: main.cpp
#include "header.h"
int main(){
    double x=4;
    foo(x);
    return 0;
}

//file: other.cpp
#include "header.h"
void somecode(){}
To summarize the above code: the header file (header.h) contains a function template and its specialization for type 'double', which is declared inline. That header is included in two implementation files (main.cpp and other.cpp).
This code built fine for him, but as soon as he removed the inline keyword, the linker started reporting multiple definitions of the foo<double> function. We might start wondering what it is between templates and inline that causes this. Something and nothing. What happens is explained below:
The linker error is not specific to templates or specializations. You will face the same problem if you turn foo into a non-template function. A non-inline function with external linkage must be defined exactly once in the whole program, normally in a single .cpp file (implementation file). It can, however, be declared any number of times. When we provide the definition in a header file instead, and that header is included in multiple implementation files (main.cpp and other.cpp), each of them gets its own definition, and the linker rightly complains. With a function template we are forced to put the definition in the header along with the declaration, since 'export' is supported by practically no compiler other than Comeau. That causes no trouble for the primary template, because the instantiation rules guarantee that the program ends up with just one copy of each instantiation, generated as and when needed. An explicit specialization is a different case: it is no longer a template but an ordinary function definition, so including it in multiple implementation files gives us multiple definitions.
An inline function, on the other hand, is meant to be defined in the header so that its body is visible at every point where it is used. Whether the calls actually get inlined in the final executable is irrelevant. You just follow the rule for defining inline functions: as long as every implementation file sees the same definition, the program is treated as having only one copy of the function.
Coming to the solutions to the above linker error, below are the two simplest ways:
1. Make the specialization inline (as in the original code, in case someone facing this issue didn't have it in the first place).
2. Make it an ordinary overload of foo instead of a specialization: declare it in the header (header.h) and define it in exactly one implementation file (main.cpp, other.cpp, or a .cpp of its own kept for this purpose), making sure it is not defined twice.
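The two options can be sketched side by side. This is only an illustration under a few assumptions: foo returns a std::string instead of printing (so the result is easy to inspect), the specialization is shown for char so that it does not clash with the double overload, and everything sits in one listing with comments marking which file each part would live in.

```cpp
#include <sstream>
#include <string>

// --- would live in header.h ---

// Primary template: safe to define in a header.
template <class T>
std::string foo(T val)
{
    std::ostringstream out;
    out << "foo<T>(" << val << ")";
    return out.str();
}

// Option 1: an explicit specialization defined in the header must be inline,
// otherwise every .cpp that includes the header emits its own definition.
template <>
inline std::string foo<char>(char val)
{
    std::ostringstream out;
    out << "foo<char>(" << val << ")";
    return out.str();
}

// Option 2: an ordinary overload, only declared in the header.
std::string foo(double val);

// --- would live in exactly one .cpp file ---

std::string foo(double val)
{
    std::ostringstream out;
    out << "foo(double)(" << val << ")";
    return out.str();
}
```

Calling foo(4) picks the primary template, foo('c') picks the inline specialization, and foo(4.0) picks the non-template overload, because an exact-match ordinary function is preferred over a template instantiation.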
References:
1. Codeguru thread where I came across this - "inline" and linking errors
2. C++ FAQ Lite on Inline Functions
Posted by abnegator at 1/24/2009 10:48:00 PM
Labels: export, inline, linker, multiple definition, template specialization
Friday, August 15, 2008
Permutations in C++
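A small, simple sample that illustrates how to get the various permutations of the characters of a string in C++ using std::next_permutation, which is provided by the standard header <algorithm>. Note that the loop visits every permutation only because the input starts out in sorted order, as "ABC" does.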
[code]
#include <algorithm>
#include <iostream>
#include <iterator> //for std::ostream_iterator
#include <string>
#include <vector>
int main()
{
    std::string input="ABC";
    std::vector<std::string> perms;
    //the starting arrangement is the first permutation
    perms.push_back(input);
    std::string::iterator itBegin = input.begin();
    std::string::iterator itEnd = input.end();
    //next_permutation rearranges input in place and returns false
    //once it wraps around to the sorted order
    while(std::next_permutation(itBegin, itEnd))
    {
        perms.push_back(std::string(itBegin, itEnd));
    }
    std::copy(perms.begin(), perms.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
}
[/code]
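The sample relies on the input already being sorted. A small variation, sketched below, sorts first so that it works for any starting order. The function name allPermutations is a hypothetical helper introduced here, not something from the standard library.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns every distinct permutation of the characters
// of input, in lexicographic order.
std::vector<std::string> allPermutations(std::string input)
{
    // std::next_permutation walks forward from the current arrangement,
    // so start from the smallest one.
    std::sort(input.begin(), input.end());

    std::vector<std::string> perms;
    do
    {
        perms.push_back(input);
    } while (std::next_permutation(input.begin(), input.end()));
    return perms;
}
```

For "CAB" this yields the six arrangements from "ABC" to "CBA". Repeated characters are not duplicated in the result, so "AAB" gives three entries.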
Posted by abnegator at 8/15/2008 10:04:00 PM
Labels: algorithm, C++, next_permutation, ostream_iterator, permutations, std::copy, string, vector
Sunday, July 27, 2008
C++ unions : how to find currently active member
C++ unions can have member functions. And, to surprise you even more, they can have constructors and a destructor. But that is not all I intend to write about. A question comes straight to mind: if unions can have member functions, how would such a function know which data member is currently 'active' or 'set'? (This can come up even when the union has no member functions; it is not specific to them.)
This is important to know because if, say, mem1 of the union is set, then that is the active member and the only one whose value you may access. Accessing the value of any other member is undefined behavior as per the standard, and we all know what undefined behavior means: it may work in your case, on your machine, and then fail you in front of an important prospective client or your managers while you are demoing your application.
Here is a way to let your member functions (it applies to unions in general) know which data member is currently set:
[code]
union myunion
{
    int member; //if == 1 then mem1 is set, use it; if ==2 mem2 is set, use it
    struct
    {
        int struct_id; //1
        //other members
    } mem1;
    struct
    {
        int struct_id; //2
        //other members
    } mem2;
    //and so on...
    void memfunc()
    {
        //could use a switch
        if(member==1)
        {
            //use mem1
        }
        else if (member ==2)
        {
            //use mem2
        }
        //and so on...
    }
};
[/code]
The inline comments are self-explanatory. The member variable 'member' acts as a tag that tells which of the struct members is currently active, and reading it is safe. What makes it safe, you might wonder? To understand the guarantee the programmer is given, let's go through what the C++ standard has to say about it. 9.5 [class.union]/1 provides the rules behind the above explanation. I will quote that paragraph in full below:
[quote]
In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time. [Note: one special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (9.2), and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members; see 9.2. —end note] The size of a union is sufficient to contain the largest of its data members. Each data member is allocated as if it were the sole member of a struct. A union can have member functions (including constructors and destructors), but not virtual (10.3) functions. A union shall not have base classes. A union shall not be used as a base class. An object of a non-trivial class (clause 9) shall not be a member of a union, nor shall an array of such objects. If a union contains a static data member or a member of reference type the program is ill-formed.
[/quote]
The above tells you that unions are allowed to have member functions, including constructors and destructors. It also states that when the structs inside a union share a common initial sequence, it is legal to inspect (i.e. query or read the value of) that common initial sequence. It further says that each data member is allocated as if it were the sole member of a struct. In our example above, 'member' is such a data member. It is as if the union had a struct member containing just a single data member of type 'int'.
Moreover, section 9.2/18 adds the following:
[Quote]
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note]
[/quote]
Considering the above quotes, it is guaranteed that each 'struct_id' (in the code sample above) sits at the very start of the union and shares the same layout as its first member, the int named 'member'. Hence reading them (they form the same initial sequence) is perfectly safe, independent of which union member was set last (or is active). Read strictly, the note speaks only of the struct members, so if in doubt, read the tag through mem1.struct_id rather than through 'member'.
The int member 'member' thus gives us a convenient way to see which data member of the union is currently active, without peeking into any specific struct member (not that it is impossible) and without any effect on the size of the union.
Having member functions find the active member this way is not very extensible. What if you plan to add another struct member? You would need to extend the switch statement, or the if-else-if chain, to accommodate it. But then, even when a union has no member functions, you may very well need to know which member is active, and the above method helps.
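Here is a compact sketch of the same idea that stays strictly within the common-initial-sequence guarantee by reading the tag through one of the struct members. All names (IntValue, CharValue, Variant, makeInt, makeChar) are hypothetical and chosen only for this illustration.

```cpp
#include <string>

// Both structs are standard-layout and begin with an int tag,
// so they share a common initial sequence.
struct IntValue  { int struct_id; int  value; }; // struct_id == 1
struct CharValue { int struct_id; char value; }; // struct_id == 2

union Variant
{
    IntValue  asInt;
    CharValue asChar;

    // Inspecting the common initial sequence is permitted
    // whichever of the two structs is currently active.
    int activeId() const { return asInt.struct_id; }

    std::string activeName() const
    {
        switch (activeId())
        {
            case 1:  return "int";
            case 2:  return "char";
            default: return "unknown";
        }
    }
};

// Hypothetical factory functions that set the tag together with the payload.
inline Variant makeInt(int v)
{
    IntValue i = {1, v};
    Variant u;
    u.asInt = i;
    return u;
}

inline Variant makeChar(char c)
{
    CharValue ch = {2, c};
    Variant u;
    u.asChar = ch;
    return u;
}
```

Setting the tag in the same place where the payload is stored keeps the two from drifting apart, which is the main risk with any hand-rolled tag.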
Saturday, May 03, 2008
comparing structs with memcmp
Can you use memcmp to compare C-style structs reliably? Let's first see what the C standard has to say about the function memcmp.
From the C standard, 7.21.4.1, The memcmp function:
[QUOTE]
Synopsis
1 #include <string.h>
int memcmp(const void *s1, const void *s2, size_t n);
Description
2 The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2.(248)
Returns
3 The memcmp function returns an integer greater than, equal to, or less than zero, accordingly as the object pointed to by s1 is greater than, equal to, or less than the object pointed to by s2.
[QUOTE END]
The footnote (248) mentioned above is important and interesting:
[QUOTE]
The contents of ‘‘holes’’ used as padding for purposes of alignment within structure objects are indeterminate. <snipped>
[QUOTE END]
There's our trouble. Indeterminate values. A structure can contain unused padding bytes, as needed by the alignment requirements of the platform, and how they are filled is not defined by the standard. It does not even say whether they get initialized, let alone how. You might get lucky with the indeterminate values, but in the programming world we always consider ourselves unlucky with something like that. So, unless you are very sure that there is no padding, memcmp is rendered useless. You can find out whether your struct has padding as below:
[CODE]
if (sizeof(mystruct) == sizeof(member1) + sizeof(member2) /*+ ..so on...*/)
{
    //no padding - memcmp can probably be used - but are you really sure?
}
else
{
    //definitely has some padding - don't use memcmp
}
[CODE END]
Even if you find no padding, memcmp is not safe in all cases. One reason I can make out is that +0 and -0 are the same value to us. They should evaluate as equal, but memcmp does not see it that way. To generalize: whenever a value has more than one bit representation, memcmp may report false negatives, where the values are equal but their bit patterns compare unequal.
Below are a few points of concern that prevent the reliable use of memcmp for comparing 2 structs, or even 2 fundamental datatypes or arrays of those:
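The +0/-0 case is easy to reproduce. The sketch below assumes a platform with IEEE 754 doubles, where positive and negative zero have different bit patterns. The struct Sample and the two helper functions are hypothetical names used only for this illustration.

```cpp
#include <cstring>

// A struct with a single double member, so padding plays no part here.
struct Sample
{
    double value;
};

// Byte-wise comparison: looks at the object representation only.
bool sameBytes(const Sample& a, const Sample& b)
{
    return std::memcmp(&a, &b, sizeof(Sample)) == 0;
}

// Value comparison: uses the built-in == on the member.
bool sameValue(const Sample& a, const Sample& b)
{
    return a.value == b.value;
}
```

With one Sample holding 0.0 and another holding -0.0, sameValue is expected to say they are equal while sameBytes says they differ (assuming IEEE 754), which is exactly the kind of false negative described above.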
1. Unused bits: Not all bits of a type (except unsigned char) need to be used for its value representation. For example, on a platform where an int is 4 bytes and a byte has 8 bits, not all 32 bits have to be used to represent the value held by the int. On such a platform the largest possible value of an int would not be 2^31 - 1 but something smaller. How the unused bits are filled is implementation-specific, and hence memcmp is not reliable even for comparing fundamental types like int whose values are equal. The C standard calls such unused bits padding bits.
2. Padding bytes: Padding bytes are placed between the members of a struct (or after the last one) to cater to alignment requirements. They have indeterminate values that may compare equal or unequal in a non-deterministic way. So memcmp can report two structs as different even though every member is equal (a false negative). A result of "equal" is reliable, but you will see it for only a subset of the struct pairs that really are equal.
3. Unreliable treatment of unused bits and padding bytes after memset: Calling memset on the types or structs can probably work around the indeterminate values of uninitialized padding bytes and unused bits (it treats the data as a sequence of unsigned chars, which have no unused bits). But there can be other reasons for failure, and who knows whether the values of those padding bytes change during the run of the program between the memset call and the memcmp call. If the standard explicitly prohibits an implementation from doing so, please let me know and I shall correct myself. The relevant quote would be fantastic!
4. Floating types: Floating-point numbers often cannot be compared reliably even with ==, let alone memcmp. Computed results usually have to be checked for being within some tolerance of each other, typically derived from the epsilon() value the implementation provides (std::numeric_limits<T>::epsilon() in C++).
5. Pointer representation: Two pointers can have different bit representations and still refer to the same linear address, so pointers that differ byte-wise may compare equal with ==. Of particular peculiarity is the segment:offset representation, where two different segment:offset pairs stored in a pointer's representation may actually be referring to the same linear address. Linear address here = segment*16 + offset, and there is more than one pair of segment and offset for which that equation evaluates to the same linear address. The built-in == operator is guaranteed to work such that two pointers of the same type pointing to the same object always compare equal. All of this assumes you only want to compare the pointers themselves and not the values they point to.
Considering the above cases in which memcmp is unreliable, I would choose member-wise comparison as the way to compare 2 structs of the same type. If the structs are of different types with comparable members (which could themselves be of different types), memcmp would probably fail even more miserably. If it is a non-POD struct, the result of a byte-wise comparison is not something the standard lets you rely on at all.
It is worth going through the following discussion on comp.lang.c, which discusses the above in greater detail: Expert-Q: (a!=b) != memcmp(&a, &b, sizeof a) ?.
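A member-wise comparison can be as small as the sketch below. The struct Record and its members are hypothetical. On typical platforms there are padding bytes between tag and count, which is precisely what this comparison never looks at.

```cpp
// Hypothetical struct; most platforms insert padding after tag
// so that count is suitably aligned.
struct Record
{
    char tag;
    int  count;
};

// Member-wise equality: compares values, never padding bytes.
// A floating-point member would need its own tolerance-based check here.
bool operator==(const Record& a, const Record& b)
{
    return a.tag == b.tag && a.count == b.count;
}

bool operator!=(const Record& a, const Record& b)
{
    return !(a == b);
}
```

The cost is that the operator has to be kept in step with the struct whenever a member is added, but in return the result depends only on the values the program actually stored.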
Posted by abnegator at 5/03/2008 05:31:00 PM
Labels: C, memcmp, memset, padding, sizeof, structs
