Python fuzzy binary/ascii file difference checker function

Completed Posted Sep 3, 2007 Paid on delivery

Python file compare routine. I need a small Python function that compares two similar files, A and B, and returns True if they match and False if they do not.

The trick is that I know for sure the files will never be exactly identical. They are PDF files, which are part ASCII and part binary streams. The files will normally differ in three or more areas because of date stamps and ID tags. If these are the ONLY differences, I need the function to return True, ignoring these small differences. The three areas are all in ASCII sections of the PDF, so they are easy to spot. Basically, lines such as:

/CreationDate (D:********)
/M (D:***********)
/ID [ *********** ]

There might be more than one /M line. I want these three cases IGNORED in the file comparison test. I can't use a simple ASCII line loop because of the binary data between the stream and endstream keywords. Differences in any other area, including the binary streams, should return False.

Hopefully this is a simple job for someone who codes Python night and day. It will go into an existing Python regression test suite. I don't need a full technical document or report, just a working Python routine with good in-line comments. I prefer that it not depend on extra packages, just what is native in current Python installs. Oh, and I need it to work on Python 2.5 in a Windows XP environment.

I've attached sample files A and B plus a difference report so you can see exactly what the problem is before bidding. Thanks to everyone who bids, I need some help!
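To make the request concrete, here is a minimal sketch of one way such a routine could look. It is not the delivered solution. The names `normalize_pdf_bytes`, `pdf_bytes_match`, and `pdfs_match` are hypothetical, the regular expressions are guesses based only on the three patterns described above, and the code assumes each binary section is cleanly delimited by `stream` and `endstream`. It is written for current Python 3; Python 2.5 has no `b"..."` literals or unconditional `with` statement, so it would need small adjustments there.

```python
import re

# A binary section: the "stream" keyword, an end-of-line, then everything up
# to the next "endstream". Assumption: the stream data never contains the
# literal text "endstream" itself.
STREAM_RE = re.compile(rb"(stream\r?\n.*?endstream)", re.DOTALL)

# The three volatile ASCII fields named in the request (patterns are guesses).
VOLATILE_RE = re.compile(
    rb"/CreationDate\s*\(D:[^)]*\)"   # /CreationDate (D:...)
    rb"|/M\s*\(D:[^)]*\)"             # /M (D:...), may occur many times
    rb"|/ID\s*\[[^\]]*\]"             # /ID [ ... ]
)


def normalize_pdf_bytes(data):
    """Return data with the volatile fields masked outside binary streams."""
    # With one capture group, split() alternates: text, stream, text, ...
    parts = STREAM_RE.split(data)
    result = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # ASCII section: blank out date stamps and ID tags.
            part = VOLATILE_RE.sub(b"<IGNORED>", part)
        # Odd indexes are binary streams and are kept byte for byte.
        result.append(part)
    return b"".join(result)


def pdf_bytes_match(data_a, data_b):
    """True if the two byte strings differ only in the volatile fields."""
    return normalize_pdf_bytes(data_a) == normalize_pdf_bytes(data_b)


def pdfs_match(path_a, path_b):
    """Compare two PDF files on disk, ignoring the volatile fields."""
    # Binary mode matters on Windows: text mode would translate line endings.
    with open(path_a, "rb") as file_a:
        data_a = file_a.read()
    with open(path_b, "rb") as file_b:
        data_b = file_b.read()
    return pdf_bytes_match(data_a, data_b)
```

In a regression suite this might be called as `pdfs_match("expected.pdf", "actual.pdf")`, where both file names are placeholders. Splitting on the stream boundaries first is a deliberate choice: it keeps the masking away from binary data, so a byte sequence inside a stream that happens to look like `/M (D:...)` still counts as a real difference.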

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

Python 2.5, Windows XP, Adobe PDF

Engineering MySQL PHP Python Software Architecture Software Testing

Project ID: #3263371

About the project

4 proposals Remote project Active Sep 4, 2007

Awarded to:

testpulsevw

See private message.

$63.75 USD in 10 days
(102 Reviews)
7.5

4 freelancers are bidding on average $52 for this job

jantomka

See private message.

$34 USD in 10 days
(25 Reviews)
4.5
luque

See private message.

$38.25 USD in 10 days
(14 Reviews)
4.3
ziliaidis

See private message.

$72.25 USD in 10 days
(10 Reviews)
3.9