
The Geek Culture Forums


Topic: Binary output in C++
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 15, 2003 22:42
I'm writing a MIPS assembler in C++ for a class (Data Structures and Algorithms). The only problem that I can't seem to overcome is writing binary output...

I am familiar with passing longs to the write() function (using sizeof()), but I have to write strange quantities (like a 6-bit opcode and 5-bit references to registers). Any suggestions on how to do the output?

--------------------
My Site

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
The Famous Druid

Gold Hearted SuperFan!
Member # 1769

Member Rated:
4
posted October 15, 2003 22:54
If the format is fixed, you could do it using bitfields.

If it's variable format, you'll probably have to bit-bang it yourself, i.e. use bitwise OR and shift instructions to build the final result.

--------------------
If you watch 'The History Of NASA' backwards, it's about a space agency that has no manned spaceflight capability, then does low-orbit flights, then lands on the Moon.

Posts: 10668 | From: Melbourne, Australia | Registered: Oct 2002  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 15, 2003 23:50
Well, there are 3 different instruction formats... the opcode is always a certain length, the extended opcode is always a certain length (but is only in Type R), the imm is a certain length (only in one type as well), and addresses are a certain length. I was thinking of shifting bits and adding, shifting bits and adding, shifting bits and adding... but that doesn't seem to be the best way to get through it.

What is a bit-field? You mean like a constant multiplier or some form of binary object (a field of bools)? I'm afraid I haven't done any binary output before (except in the MIPS assembly language for the class in assembly and organization).

--------------------
My Site

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
The Famous Druid

Gold Hearted SuperFan!
Member # 1769

Member Rated:
4
posted October 16, 2003 00:12
typedef struct
{
unsigned int singleBit:1;
unsigned int fiveBits:5;   // unsigned, so it holds 0..31
unsigned int tenBits:10;   // unsigned, so it holds 0..1023
} myBitField;

myBitField bits;

declares a variable 'bits' holding 16 bits of data, which can be accessed as 3 fields of 1, 5 and 10 bits respectively.

Note: some compilers may pad the struct to a multiple of 32 bits, so you should use sizeof() to confirm it's the size you think it is.
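
For the MIPS case, a minimal sketch of the same idea might look like the following. It assumes the usual R-type split of 6/5/5/5/5/6 bits; the names RTypeBits, rTypeSizeInBytes and makeAdd are made up for illustration. Which bits of the underlying word each field lands in is still up to the compiler, so this only shows the declaration and the sizeof() check, not a guaranteed layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bit-field view of a MIPS R-type instruction
// (6 + 5 + 5 + 5 + 5 + 6 = 32 bits). All fields are unsigned so
// that stored values read back unchanged.
struct RTypeBits {
    unsigned int opcode : 6;
    unsigned int rs     : 5;
    unsigned int rt     : 5;
    unsigned int rd     : 5;
    unsigned int shamt  : 5;
    unsigned int funct  : 6;
};

// Size the compiler actually chose for the struct, in bytes.
// Check this before assuming the struct is exactly one word.
std::size_t rTypeSizeInBytes() {
    return sizeof(RTypeBits);
}

// Fills in the fields of an "add rd, rs, rt" instruction
// (opcode 0, funct 0x20). The position of each field inside the
// word is implementation-defined.
RTypeBits makeAdd(unsigned rs, unsigned rt, unsigned rd) {
    RTypeBits b = RTypeBits();
    b.opcode = 0;
    b.rs = rs;
    b.rt = rt;
    b.rd = rd;
    b.shamt = 0;
    b.funct = 0x20;
    return b;
}
```

The fields read back as plain integers (for example makeAdd(9, 10, 8).rd is 8), but nothing here says what the bytes look like when the struct is written to a file.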

Posts: 10668 | From: Melbourne, Australia | Registered: Oct 2002  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 16, 2003 06:56
BRILLIANT!!!! I already have classes that look like:
code:
#include <ostream>
using std::ostream;

// Node is my own base class, defined elsewhere.
class Word : public Node
{
public:
    virtual void write(ostream & sout);
    virtual ~Word(){}
};

// I-type: opcode, rs, rt, immediate
class TypeI : public Word
{
public:
    TypeI(int op, int s, int t, int i) : opt(op),
        rs(s), rt(t), imm(i){}
    virtual ~TypeI(){}
    virtual void write(ostream & sout);
    int opt;
    int rs;
    int rt;
    int imm;
    static const int optSize = 6;
    static const int rsSize = 5;
    static const int rtSize = 5;
    static const int immSize = 16;  // 6 + 5 + 5 + 16 = 32 bits
};

// R-type: opcode, rs, rt, rd, shift amount, function code
class TypeR : public Word
{
public:
    TypeR(int op, int s, int d, int t, int sh,
        int exopt) : opt(op), rs(s), rt(t),
        rd(d), shamt(sh), funct(exopt){}
    virtual ~TypeR(){}
    virtual void write(ostream & sout);
    int opt;
    int rs;
    int rt;
    int rd;
    int shamt;
    int funct;
    static const int optSize = 6;
    static const int rsSize = 5;
    static const int rtSize = 5;
    static const int rdSize = 5;
    static const int shamtSize = 5;
    static const int functSize = 6;  // 6 + 5 + 5 + 5 + 5 + 6 = 32 bits
};

So, then, if I rewrite the int declarations I can cast the whole object as a char* and use print with sizeof() (the object will always be 32 bits long).

That also helps me solve another problem I was going to run into... *bows graciously* The functions in the classes aren't encoded in the bits in any way, just the data, right... or will I have to wrap a bit field with functionality in the form of an outer class... Ah, I'll play with it and see what I get.

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
quantumfluff
BlabberMouth, a Blabber Odyssey
Member # 450

Member Rated:
5
posted October 16, 2003 07:42
It isn't so simple. If you want to do this right, rather than just working by luck, you should not rely on bit fields. The alignment within a word is implementation defined, so you can't be sure you get what you intended. Also, you can't just write out a 16 or 32 bit quantity and be portable. That is
code:
  long foo = 0x01;

write(fd, (const char *)&foo, sizeof(foo));

is 0x01, 0x00, 0x00, 0x00 on some machines, and
0x00, 0x00, 0x00, 0x01 on others. (The VAX is little-endian as well; it was the older PDP-11 that used a mixed order for 32-bit values, giving 0x00, 0x00, 0x01, 0x00.) Intel architectures put the 1 first. SPARC puts it at the end. MIPS can go either way, depending on the rest of the hardware!
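
A quick way to see which case a given machine falls into is to copy the value's bytes out and look at them. This is only a sketch: it uses std::uint32_t instead of long so the width is fixed at 4 bytes, and the function names hostIsLittleEndian and rawBytes are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the 4 bytes of `value` exactly as they sit in memory on
// this machine. This is what a raw write() of the variable emits.
void rawBytes(std::uint32_t value, unsigned char out[4]) {
    std::memcpy(out, &value, 4);
}

// True when the host stores the least significant byte first
// (Intel-style), false when the first byte in memory is not the
// least significant one (for example big-endian SPARC).
bool hostIsLittleEndian() {
    unsigned char bytes[4];
    rawBytes(0x01u, bytes);
    return bytes[0] == 0x01;
}
```

On a little-endian host rawBytes(1, b) fills b with 01 00 00 00; on a big-endian host it gives 00 00 00 01. The same source code therefore produces different files on different machines.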

What I would do is create macros for the three different opcode types which take the parameters and shift and OR them together to build a 32-bit quantity. That can be assigned to a long. Then have a routine that writes the opcode. It should select out the bytes in the order you need them and write them individually. (So that the performance is not too egregiously bad, you should be using a buffered output stream.)
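
As a rough sketch of that approach, the following uses small functions rather than macros (a swapped-in choice, not what the post literally suggests) and assumes the standard MIPS field layout with the opcode in the top 6 bits. The names encodeR, encodeI, encodeJ, writeWordBigEndian and wordToBytes are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
std::uint32_t encodeR(unsigned opcode, unsigned rs, unsigned rt,
                      unsigned rd, unsigned shamt, unsigned funct) {
    return (std::uint32_t(opcode & 0x3Fu) << 26) |
           (std::uint32_t(rs     & 0x1Fu) << 21) |
           (std::uint32_t(rt     & 0x1Fu) << 16) |
           (std::uint32_t(rd     & 0x1Fu) << 11) |
           (std::uint32_t(shamt  & 0x1Fu) << 6)  |
            std::uint32_t(funct  & 0x3Fu);
}

// I-type: opcode(6) rs(5) rt(5) immediate(16).
// A negative immediate is kept as its low 16 bits.
std::uint32_t encodeI(unsigned opcode, unsigned rs, unsigned rt, int imm) {
    return (std::uint32_t(opcode & 0x3Fu) << 26) |
           (std::uint32_t(rs     & 0x1Fu) << 21) |
           (std::uint32_t(rt     & 0x1Fu) << 16) |
           (std::uint32_t(imm)   & 0xFFFFu);
}

// J-type: opcode(6) target(26)
std::uint32_t encodeJ(unsigned opcode, std::uint32_t target) {
    return (std::uint32_t(opcode & 0x3Fu) << 26) |
           (target & 0x03FFFFFFu);
}

// Writes the word one byte at a time, most significant byte first,
// so the file is the same whatever the host's byte order is.
void writeWordBigEndian(std::ostream& out, std::uint32_t word) {
    out.put(static_cast<char>((word >> 24) & 0xFFu));
    out.put(static_cast<char>((word >> 16) & 0xFFu));
    out.put(static_cast<char>((word >> 8)  & 0xFFu));
    out.put(static_cast<char>( word        & 0xFFu));
}

// Convenience for testing: the 4 bytes that would be written.
std::string wordToBytes(std::uint32_t word) {
    std::ostringstream s;
    writeWordBigEndian(s, word);
    return s.str();
}
```

For example, add $t0, $t1, $t2 is encodeR(0, 9, 10, 8, 0, 0x20), which gives 0x012A4020, and writeWordBigEndian emits it as the bytes 01 2A 40 20. If the target file wants little-endian words, only the byte order inside writeWordBigEndian has to change.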

Posts: 2901 | From: 5 to 15 meters above sea level | Registered: Jun 2000  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 16, 2003 07:57
Platform matters not, it doesn't need to be portable... It must run on SPARC (the assembled code will never be run or used, just diffed against the expected output). In my excitement I made the changes to make the ints the proper size, and then wrote the object, and I get too many words, and the wrong value - the non-encoded value is correct... I suppose there is no way to avoid bit shifting... :/

--------------------
My Site

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
quantumfluff
BlabberMouth, a Blabber Odyssey
Member # 450

Member Rated:
5
posted October 16, 2003 11:16
FYI: SPARC is big-endian, so casting the address of a long to (char *) and looking at that will show the high byte first. A 1 will be three bytes of zeros and finally a 1. The spec for the MIPS object file will define how it wants its numbers to come in.

Even if there's no way to avoid the shifting, you can encapsulate it in macros or functions. You should never see it sprinkled in the code.
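
One possible shape for that encapsulation, as a sketch only: a pair of helpers that take a field's width and bit position, so the masks and shifts live in one place. The names withField and fieldOf, and the choice to throw on an oversized value, are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Returns `word` with `value` stored in the `width`-bit field whose
// lowest bit is at position `shift`. Throws if the value does not
// fit, instead of silently spilling into the neighbouring field.
std::uint32_t withField(std::uint32_t word, std::uint32_t value,
                        unsigned width, unsigned shift) {
    assert(width >= 1 && width < 32 && shift + width <= 32);
    const std::uint32_t mask = (std::uint32_t(1) << width) - 1u;
    if (value > mask) {
        throw std::out_of_range("value does not fit in field");
    }
    return (word & ~(mask << shift)) | (value << shift);
}

// Reads the same field back out of a word.
std::uint32_t fieldOf(std::uint32_t word, unsigned width, unsigned shift) {
    assert(width >= 1 && width < 32 && shift + width <= 32);
    const std::uint32_t mask = (std::uint32_t(1) << width) - 1u;
    return (word >> shift) & mask;
}
```

With this, setting rs to register 9 reads as withField(word, 9, 5, 21), and a register number of 32 is rejected rather than corrupting the opcode bits.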

Posts: 2901 | From: 5 to 15 meters above sea level | Registered: Jun 2000  |  IP: Logged
Lex
Uber Geek
Member # 835

Member Rated:
5
posted October 16, 2003 12:11
Your Data Structures and Algorithms class sounds a lot cooler than mine. Here we are, maybe 8 weeks into the semester, and we're learning about what a tree structure is. We get projects to do and two whole weeks to do them in. I do them in a day, and it only takes that long because it's boring. Then again, I guess they have to tone it down. Generally about two days before the project is due, lots of people are asking each other things like "Hey, did you figure out how to use that hashtable thing?" We don't even have to implement the actual structures, just know how to use the provided implementations.

--------------------
Your conviction that there is a monster under the bed would be a mere eccentricity if you weren't so heavily armed and it was your own bed.

Posts: 977 | From: University of Florida | Registered: Jul 2001  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 16, 2003 13:24
quote:
Originally posted by quantumfluff:
FYI: SPARC is big-endian, so casting the address of a long to (char *) and looking at that will show the high byte first. A 1 will be three bytes of zeros and finally a 1. The spec for the MIPS object file will define how it wants its numbers to come in.

Even if there's no way to avoid the shifting, you can encapsulate it in macros or functions. You should never see it sprinkled in the code.

MIPS is also big-endian... Hence most people will probably store the instruction in a long. The fact that the size of each class is 2 words (instead of one) makes me think that other class data is being interpreted in the cast, and all I will have to do is separate functionality and data...

The output is a small fish to fry, and I just wanna get it out of the way. The labeling scheme I have is quite conventional (but I still think it's neat), and I won't bother about whether or not it's pretty, I just want it to work.... I can worry about being pretty as we near turn-in.

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 16, 2003 13:46
quote:
Originally posted by Lex:
Your Data Structures and Algorithms class sounds a lot cooler than mine. Here we are, maybe 8 weeks into the semester, and we're learning about what a tree structure is. We get projects to do and two whole weeks to do them in. I do them in a day, and it only takes that long because it's boring. Then again, I guess they have to tone it down. Generally about two days before the project is due, lots of people are asking each other things like "Hey, did you figure out how to use that hashtable thing?" We don't even have to implement the actual structures, just know how to use the provided implementations.

We are doing the same thing in the lectures... just got to what a Red-Black tree is. But in Computer Programming II, we all wrote a dynamic array, linked-list, linked list with external iterators, binary search tree... Now we are looking at theory in class and writing this beast for homework.

Hashing hasn't been covered yet, except that you put in a "key" and get back a "value"... I've seen it from my personal reading, but lectures are still Data structures lectures.

--------------------
My Site

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
The Famous Druid

Gold Hearted SuperFan!
Member # 1769

Member Rated:
4
posted October 16, 2003 14:44
quote:
Originally posted by GameMaster:
So, then, if I rewrite the int declarations I can cast the whole object as a char* and use print with sizeof() (the object will always be 32 bits long).

You can't write the whole object, as it also contains some C++ administrivia (virtual methods are typically implemented through a hidden pointer to a table of function pointers, and that pointer is stored in each instance of the object). That's why you wrote too many words.

Make a local struct containing just the stuff you want to write out, and have your write() method write it, and only it. And check that the problems others mentioned about endianness etc. haven't bitten you; you may need to play around with the order of declarations in the struct.
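
The extra size is easy to observe. This is a sketch with made-up types (PlainWord, VirtualWord); the exact numbers depend on the compiler and platform, but on common implementations a class with any virtual function carries a hidden pointer in every instance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Just the data that belongs in the output file.
struct PlainWord {
    std::uint32_t bits;
};

// Same data, plus a virtual destructor. Mainstream compilers add a
// hidden per-object pointer to support virtual dispatch, so the
// object is larger than the 4 bytes of payload.
struct VirtualWord {
    std::uint32_t bits;
    virtual ~VirtualWord() {}
};

std::size_t plainSize()   { return sizeof(PlainWord); }
std::size_t virtualSize() { return sizeof(VirtualWord); }
```

Writing sizeof(VirtualWord) bytes starting at the object's address would dump that hidden pointer (and any padding) into the file along with the instruction, which matches the "too many words" symptom. Writing only a PlainWord-style member avoids it.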

--------------------
If you watch 'The History Of NASA' backwards, it's about a space agency that has no manned spaceflight capability, then does low-orbit flights, then lands on the Moon.

Posts: 10668 | From: Melbourne, Australia | Registered: Oct 2002  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 16, 2003 20:35
MIPS can be set to either, so endianness is a problem... I'm going to have to shift and OR.

--------------------
My Site

Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged
quantumfluff
BlabberMouth, a Blabber Odyssey
Member # 450

Member Rated:
5
posted October 17, 2003 07:48
'told you so [Big Grin]

Just look at it this way. Object code is messy stuff. It's not supposed to be clean to look at, but easy for the hardware to work with and not too inefficient for loaders to resolve dynamic symbols against. Also, since smaller is better, they are going to have to pack in the bits. But this is OK. Only one in 10K programmers ever works on the object code level - they should be good enough to slog through it.

Think of your car. A modern internal combustion engine is a remarkably complex device designed by highly trained engineers. Most people only see the ignition switch and accelerator.

Of course, what's really fascinating is to compare instruction formats for various types of machines and see how complexity changes over time as we moved along the CISC/RISC and hardwired vs. microcoded dimensions. (Well, fascinating to me, but most of your eyes are probably glazing over right now). Take an old architecture like the DEC PDP-10. A few hours and you can read an octal dump and convert to assembler in your head because it's so nice and regular. 9 bits for opcode, 4 each for registers, an indirect address bit and 18 for address/data. It was a totally hard-coded machine, so the opcode could be further divided into chunks: a major opcode and then modifiers to it. It was so simple that it had dozens of effective NOP instructions, because you could do things like move from one register into itself - in 10 different ways. Still, it was real nice for people who wrote in assembler - and there were many at the time - because it was so expressive.

When they started microcoding machines, instruction sets could get really hairy. Opcodes were just pointers into a microcode program, so there was no need for any regularity. It was a real nightmare for the assembler writer, because you could have variation anywhere.

RISC processors moved us back to more regular instruction sets (that's why your MIPS has fewer formats than an Intel chip). But, since people were not supposed to program in assembler any more, they made the instructions more like microcode. The work of picking the right sequence of instructions is now in the compiler.

Posts: 2901 | From: 5 to 15 meters above sea level | Registered: Jun 2000  |  IP: Logged
GameMaster
BlabberMouth, a Blabber Odyssey
Member # 1173

Member Rated:
4
posted October 17, 2003 08:04
Actually, it shouldn't be an endianness problem, but it's the only thing that I can think of. The shifting of bits will be easy... Is a shift less overhead than a multiply? I assume so, because most modern architectures use Booth's algorithm, but I don't know how SPARC does it.
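
On the shift-versus-multiply question, the two give the same result for unsigned values, since shifting left by n is multiplying by 2 to the power n (modulo 2^32). The sketch below, with invented function names, just shows that equivalence; it says nothing about timing, which depends on the hardware, and compilers commonly turn a multiply by a constant power of two into a shift anyway.

```cpp
#include <cassert>
#include <cstdint>

// Moves `value` into position with a left shift (shift must be < 32).
std::uint32_t placeByShift(std::uint32_t value, unsigned shift) {
    return value << shift;
}

// Same result using multiplication by 2^shift, built up in a loop
// to make the arithmetic explicit.
std::uint32_t placeByMultiply(std::uint32_t value, unsigned shift) {
    std::uint32_t factor = 1u;
    for (unsigned i = 0; i < shift; ++i) {
        factor *= 2u;
    }
    return value * factor;
}
```

Both placeByShift(9, 21) and placeByMultiply(9, 21) give 0x01200000, the rs field for register 9. For an assembler that writes a file once, either form is unlikely to be the bottleneck.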
Posts: 3038 | From: State of insanity | Registered: Mar 2002  |  IP: Logged


All times are Eastern Time  