February 28th, 2007

A task arose recently when one of my clients asked me to help process some binary files they had. They needed a new file created from their input file, with all occurrences of one specific byte replaced by a different specific byte. No problem, I thought, but deciding which method to use for reading and replacing the values proved not to be immediately obvious.

There are several approaches to reading and writing files, as well as several approaches to replacing values; in fact, Java almost always offers a multitude of solutions to a problem.

So I decided to embark upon a little experiment to see which method would be the most efficient.

First I created my own test data: a file containing many random bytes. Each byte is a random number between 0 and 32, as that most closely matched my original client requirement. CreateTest.java will create a 10 MB file.

I wrote 4 little tests during the experiment, trying String replacement, byte comparison and replacement, and buffered streams. The result surprised me.

Method 1

For my first method I read the data into a 4k buffer using a FileInputStream. I then create a new String object from the bytes read and replace the old value with the new value using String.replace(). The bytes from the String are then written to a FileOutputStream.
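One thing to watch with the String approach is that new String(bytes) and getBytes() fall back to the platform's default charset, so whether the bytes survive the round trip depends on the machine. Below is a minimal sketch of the same idea with the charset made explicit. It assumes ISO-8859-1 is acceptable, since it maps every byte value 0-255 to exactly one char and back. The class and method names are made up for illustration and are not part of my test classes.

```java
import java.nio.charset.StandardCharsets;

public class StringRoundTrip {

    // Replace oldValue with newValue by going through a String.
    // ISO-8859-1 is named explicitly so each byte becomes exactly one char and back.
    static byte[] replaceViaString(byte[] data, int length, char oldValue, char newValue) {
        String s = new String(data, 0, length, StandardCharsets.ISO_8859_1);
        return s.replace(oldValue, newValue).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] input = {0x00, 0x01, 0x20, 0x00};
        byte[] output = replaceViaString(input, input.length, (char) 0x00, (char) 0xFF);
        // Print each output byte as two hex digits
        for (byte b : output) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

With a multi-byte default charset such as UTF-8, the char 0xFF would be written as two bytes and the output file would grow, which is why the charset is pinned here.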

Method 2

For the second method I simply wondered if I could save any processing time by doing all the comparison and replacement work of Method 1 in as few lines of code as possible. As I expected, this did not make any real difference.

Method 3

For Method 3 I used the same way of reading the file, reading into a buffer using FileInputStream, but rather than use the String replace function, I first compare the bytes and then write either the old or the new value into a second buffer. The result of this change was encouraging: the time taken to process the file was about 20-40% of the time it took using the String object.

Method 4

Feeling encouraged by the results of Method 3, I thought that if I used the buffered input and output stream objects Java gives us, I was sure to reap extra benefits. But the time taken rose dramatically.

Timing Results.

To come up with these results, I ran the classes from the command prompt 10 times each and recorded the average time taken.
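For anyone wanting to repeat the averaging without running each class by hand, here is a rough sketch of a timing helper. It is not what I used (I ran each class separately from the command prompt), and running everything inside one JVM lets the JIT warm up, so the numbers will not be directly comparable. TimingHarness, Task and averageMillis are names invented for this example.

```java
public class TimingHarness {

    // A task that may throw, so file-processing code can be passed in directly.
    interface Task {
        void run() throws Exception;
    }

    // Run the task 'runs' times and return the average elapsed time in milliseconds.
    static double averageMillis(Task task, int runs) throws Exception {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (runs * 1_000_000.0);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload; swap in a call such as Method3.main(new String[0]).
        Task work = () -> Thread.sleep(5);
        System.out.println("average ms: " + averageMillis(work, 10));
    }
}
```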

Method1 : 500 ms
Method2 : 700 ms
Method3 : 150 ms
Method4 : 2800 ms

I wondered what difference tweaking the buffer size in Method 3 would make when processing the files. To be honest, I don’t know why I picked 4k in the first place; it just happens to be habit when defining buffer sizes.

I tried 1k, 16k, 32k, 100k, and 512k.

With a 1k buffer the average time was 280 ms.
A 16k buffer gave me an average of 90 ms.
A 32k buffer gave me an average of 78 ms.
A 100k buffer averages out at 125 ms.
And the 512k buffer was back to roughly 200ms.

So using a 32k buffer gave me the best results.
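The buffer size experiment can be sketched as a single loop over the sizes. This is an illustration rather than the code I actually ran: it works on an in-memory byte array instead of testfile.dat, so it will not show any disk effects, and it replaces bytes in place rather than copying into a second buffer as Method 3 does. BufferSizeSweep and replaceBytes are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Random;

public class BufferSizeSweep {

    // Copy in to out, swapping oldValue for newValue, using a buffer of the given size.
    static void replaceBytes(InputStream in, OutputStream out,
                             byte oldValue, byte newValue, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                if (buffer[i] == oldValue) {
                    buffer[i] = newValue; // replace in place, no second buffer
                }
            }
            out.write(buffer, 0, read);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for testfile.dat: 1 MB of random values 0..32.
        byte[] data = new byte[1024 * 1024];
        Random rn = new Random();
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) rn.nextInt(33);
        }

        int[] sizesKb = {1, 4, 16, 32, 100, 512};
        for (int kb : sizesKb) {
            long start = System.nanoTime();
            replaceBytes(new ByteArrayInputStream(data),
                    new ByteArrayOutputStream(data.length),
                    (byte) 0x00, (byte) 0xFF, kb * 1024);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(kb + "k buffer: " + elapsedMs + " ms");
        }
    }
}
```

To time real file access, the two in-memory streams could be swapped for a FileInputStream and FileOutputStream.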

Conclusion.

I was a little surprised that the time taken rose so dramatically when using the buffered input and output stream objects. I must admit my understanding of the buffering objects is not at an expert level, but I did expect to see an improvement; otherwise why use them, if controlling the buffer myself proves to be so much more efficient?

Below you will find all the code used in these tests:

/*
* CreateTest.java
*
* Created on 04 February 2007, 11:19
*
* Purpose: To create a binary test file used in file processing tests.
*/

import java.io.FileOutputStream;
import java.util.Random;

/**
*
* @author DAVE
*/
public class CreateTest {

    static final long fileSize = 10 * 1024 * 1024; // Let's make it 10 MB
    static final String outFile = "testfile.dat";
    private static Random rn = new Random();

    /** Creates a new instance of CreateTest */
    public CreateTest() {
    }

    public static void main(String[] args) throws Exception {
        // Open file for output
        FileOutputStream out = new FileOutputStream(outFile);

        for (int i = 0; i < fileSize; i++) {
            // Get random int between 0 & 32
            int idx = rand(0, 32);
            out.write(idx);
        }
        out.close();
        System.out.println("testfile.dat created.");
    }

    // Get random number between lo and hi (inclusive)
    private static int rand(int lo, int hi) {
        int n = hi - lo + 1;
        int i = rn.nextInt() % n;
        if (i < 0)
            i = -i;
        return lo + i;
    }
}

/*
* Method1.java
*
* Created on 02 February 2007, 17:49
*
* Process binary file, byte by byte performing a replace
* using Strings & replace() function.
*
*/

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class Method1 {

    public Method1() {
    }

    public static void main(String[] args) throws Exception {
        Date startTime = new Date();
        Date endTime;

        String inFile = "testfile.dat";
        String outFile = "outputfile.dat";

        char oldValue = 0x00;
        char newValue = 0xFF;
        final int bufferSize = 4 * 1024; // 4kb buffer
        byte[] buffer = new byte[bufferSize];

        FileInputStream in = new FileInputStream(inFile);
        FileOutputStream out = new FileOutputStream(outFile);

        String s = null;
        int read = in.read(buffer);
        while (read >= 0) {
            if (read > 0) {
                // ISO-8859-1 maps each byte to exactly one char, so the
                // result does not depend on the platform default charset
                s = new String(buffer, StandardCharsets.ISO_8859_1);
                if (s.length() != read) {
                    // Last read was short: drop the stale tail of the buffer
                    s = s.substring(0, read);
                }
                s = s.replace(oldValue, newValue);
                out.write(s.getBytes(StandardCharsets.ISO_8859_1));
            }
            read = in.read(buffer);
        }

        out.close();
        in.close();
        endTime = new Date();

        System.out.println(" Method 1 - time taken (ms) : " + (endTime.getTime() - startTime.getTime()));
    }
}

/*
* Method2.java
*
* Created on 02 February 2007, 17:49
*
* Process binary file, byte by byte performing a replace
* using Strings & replace() function.
*/

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class Method2 {

    public Method2() {
    }

    public static void main(String[] args) throws Exception {
        Date startTime = new Date();
        Date endTime;

        String inFile = "testfile.dat";
        String outFile = "outputfile.dat";

        char oldValue = 0x00;
        char newValue = 0xFF;
        final int bufferSize = 4 * 1024; // 4kb buffer
        byte[] buffer = new byte[bufferSize];

        FileInputStream in = new FileInputStream(inFile);
        FileOutputStream out = new FileOutputStream(outFile);

        int read = in.read(buffer);
        while (read >= 0) {
            if (read > 0) {
                // Decode, replace and encode in one statement; ISO-8859-1
                // keeps a one-to-one mapping between bytes and chars
                out.write(new String(buffer, 0, read, StandardCharsets.ISO_8859_1)
                        .replace(oldValue, newValue)
                        .getBytes(StandardCharsets.ISO_8859_1));
            }
            read = in.read(buffer);
        }

        out.close();
        in.close();

        endTime = new Date();
        System.out.println(" Method 2 - time taken (ms) : " + (endTime.getTime() - startTime.getTime()));
    }
}

/*
* Method3.java
*
* Created on 02 February 2007, 17:49
*
* Process binary file, byte by byte performing a replace
* using byte comparison only
*/

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Date;

public class Method3 {

    public Method3() {
    }

    public static void main(String[] args) throws Exception {
        Date startTime = new Date();
        Date endTime;

        String inFile = "testfile.dat";
        String outFile = "outputfile.dat";

        byte oldValue = (byte) 0x00;
        byte newValue = (byte) 0xFF;

        final int bufferSize = 32 * 1024; // 32kb buffer

        byte[] buffer = new byte[bufferSize];  // bytes read from the input file
        byte[] cBuffer = new byte[bufferSize]; // converted bytes to write out

        FileInputStream in = new FileInputStream(inFile);
        FileOutputStream out = new FileOutputStream(outFile);

        int read = in.read(buffer);
        while (read >= 0) {
            if (read > 0) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == oldValue)
                        cBuffer[i] = newValue;
                    else
                        cBuffer[i] = buffer[i];
                }
                out.write(cBuffer, 0, read);
            }
            read = in.read(buffer);
        }

        out.close();
        in.close();
        endTime = new Date();
        System.out.println(" Method 3 - time taken (ms) : " + (endTime.getTime() - startTime.getTime()));
    }
}

/*
* Method4.java
*
* Created on 02 February 2007, 17:49
*
* Process binary file, byte by byte performing a replace
* using buffered streams
*/

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Date;

public class Method4 {

    public Method4() {
    }

    public static void main(String[] args) throws Exception {
        Date startTime = new Date();
        Date endTime;

        String inFile = "testfile.dat";
        String outFile = "outputfile.dat";

        byte oldValue = (byte) 0x00;
        byte newValue = (byte) 0xFF;

        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(inFile));
        BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(outFile));

        // Read and write one byte at a time through the buffered streams
        int theByte;
        while ((theByte = bis.read()) != -1) {
            if (theByte != oldValue)
                bos.write(theByte);
            else
                bos.write(newValue);
        }

        bos.close();
        bis.close();
        endTime = new Date();
        System.out.println(" Method 4 - time taken (ms) : " + (endTime.getTime() - startTime.getTime()));
    }
}

February 16th, 2007

Right now, my stress levels are up, I’m pretty agitated and could easily have ended up answering to the police if by any chance I had in my possession a firearm of sorts…

Returning just now from dinner with my wife, I was driving my wife’s very small Daewoo Matiz.

It’s small, in fact it’s tiny.

BUT that is no reason for people to automatically assume that the driver of the car is some frail old lady who will happily slow right down and pull over at the nearest place to let whoever is driving up their rear end overtake or undertake, to complete what seems to be a mission to prevent imminent Armageddon !

It all happened within 2 miles of dual carriageway. First, I was turning right at a roundabout when a transit van entering from the opposite side came in fast, REALLY wanting to avoid coming to a full stop at all costs. They just kept coming and must have missed my rear end by a fraction before overtaking and flying past.

2 minutes later we were taking the slip road off a dual carriageway onto a roundabout, again to turn right, with someone right up my ass on the slip road. On the roundabout, even though I was indicating, they decided they MUST be in front of the small insignificant Daewoo, so they undertook me right before my exit. That kind of driving makes me sick and, not wanting to sound old, it is indeed why there are so many accidents. And that is why I would have loved to do something to put the asshole back in their place.. If not a weapon to fire into the air, then at least some blue flashing lights to put on the roof in time of need, although how convincing would a Matiz be as an undercover cop car? Hmmm

What makes my blood boil is the fact that I have never encountered these idiots on the road when driving my past vehicles (BMW, Pajero, Vectra and others, all much larger than the Matiz), so it leads me to think that people see the small car and immediately assume the driver will see them behind and kindly pull over to let them past..

Grrrrr

February 15th, 2007

Today I decided to install Postgrey on my mail server to help reduce the mass influx of crap I get in my mailbox on a daily basis. I’m not the type of person to have only one email address; I feel the need to use a catch-all as I despise the thought of losing emails. And yes, there are several thousand items in my inbox I haven’t brought myself to remove.

Anyway, I’ve had it running a couple of hours now and it looks to be pretty efficient at its job, and it’s FAR easier to set up than SpamAssassin etc.

What is it ?

In a nutshell, Postgrey temporarily rejects incoming emails from senders it has not seen before. Yes, that sounds drastic, but a legitimate mail server will resend the message, whereas the mail servers spammers use tend to send out messages in such bulk that they want to get the job over and done with as quickly as possible and don’t care if they miss a few recipients. That’s the idea anyway.
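To make the idea concrete, here is a toy model of greylisting in Java. It is only a sketch of the general technique, assuming the usual (client IP, sender, recipient) triplet as the key; it is not Postgrey's actual code, and the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class GreylistSketch {

    private final long delayMillis;
    // Time at which each (client IP, sender, recipient) triplet was first seen.
    private final Map<String, Long> firstSeen = new HashMap<>();

    GreylistSketch(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // Returns true if the message should be accepted, false for a temporary rejection.
    boolean accept(String clientIp, String sender, String recipient, long nowMillis) {
        String key = clientIp + "|" + sender + "|" + recipient;
        Long seen = firstSeen.get(key);
        if (seen == null) {
            firstSeen.put(key, nowMillis); // unknown triplet: remember it and defer
            return false;
        }
        // A retry is accepted only once the delay has passed
        return nowMillis - seen >= delayMillis;
    }

    public static void main(String[] args) {
        GreylistSketch g = new GreylistSketch(300_000); // 300 s delay
        System.out.println(g.accept("192.0.2.1", "a@example.com", "me@example.org", 0));       // false
        System.out.println(g.accept("192.0.2.1", "a@example.com", "me@example.org", 60_000));  // false
        System.out.println(g.accept("192.0.2.1", "a@example.com", "me@example.org", 300_000)); // true
    }
}
```

A real mail server retries after a deferral and so gets through on a later attempt, while a bulk sender that never retries is never accepted.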
On with the setup ..

First, I am using Postfix on an Ubuntu 6.06 based server, and I assume all commands are run with sudo.

1. Install with apt-get

apt-get install postgrey

2. Edit the postfix configuration so it will work with postgrey. I use nano as an editor.

nano /etc/postfix/main.cf

Find the line starting with smtpd_recipient_restrictions =

Add the following lines beneath it; you may already have the first, I did. Each continuation line must start with whitespace.

    reject_unauth_destination
    check_policy_service inet:127.0.0.1:60000

3. Alter Postgrey settings (Optional)

By default, Postgrey will delay an incoming message for 300s. I think 5 minutes is a bit steep, so you can change this, although a shorter delay may increase the chances of receiving spam. I changed mine to 1 minute.

nano /etc/default/postgrey

Find the line POSTGREY_OPTS="--inet=127.0.0.1:60000 --delay=300" and change 300 to 60 so you end up with POSTGREY_OPTS="--inet=127.0.0.1:60000 --delay=60"

Next you may wish to specify certain mail servers or domains that you wish Postgrey to ignore because you trust them. These are configured in the whitelist file, which is matched against the hostname or IP address of the connecting mail server.

nano /etc/postgrey/whitelist_clients

If you trust every mail server at hotmail (EXAMPLE ONLY !) you can add hotmail.com on its own line in this file.

4. Restart Everything

/etc/init.d/postgrey restart
postfix reload

That’s it. You can monitor your mail log using tail -f /var/log/mail.log whilst sending an email to yourself from hotmail, for example; you should see the mail being rejected at first, which confirms it’s working.


About Me

Welcome to my blog. Here you'll find mainly work related content, a repository for all my notes which otherwise would end up on a post-it note.