June 30, 2007
Outschmuck
I already wrote that windiot Eugene Siciunas wants to move all @utoronto.ca mail to a Megacrap Exchange server.
His soulmates at the University of Bologna already did this. The result? An enormous flow of outgoing spam, and certain services (like AOL) deemed the whole domain unibo.it a source of spam and refuse to accept email originating from it.
Unfortunately, this also affects the Department of Mathematics there, despite the fact that it runs its own UNIX-based mail servers. They are collateral damage. Should we suffer as well?
Posted by Victor at 10:55 AM
Spammed! - Again
After a certain break in the bounced-back emails with my spoofed address, the flow started again. And again this crap originated from Windoze crapputers using M$ Outluck of different brands!
Posted by Victor at 10:48 AM
Script Junkies!
(Open Letter to GreenShields, Canada)
Dear Sir/Madam,
your new website is extremely buggy. More precisely, the javascript is buggy. It works with Microcrap Internet Exploder and with Opera, but it does not work with Firefox or Safari, and it is slow.
This means that the javascript is very proprietary and platform specific, which shows that your web programmers are bloody amateurs who graduated from some crappy community college and should not be hired by (should be fired from) any reputable company.
Should I say that these script junkies are fit only to develop porn sites?
Posted by Victor at 10:41 AM
June 27, 2007
Great Job, Google!
This is not the first time I have received a phishing email at my gmail account. Every time, it not only goes to the spam folder but also displays an alert:
Warning: This message may not be from whom it claims to be. Beware of following any links in it or of providing the sender with any personal information.
This is superb privacy protection, way better than what any other free (or paid) email provider offers.
Posted by Victor at 10:00 PM
Spammed!
Today I am getting an enormous number of messages (well, all of them go to spam.gz automatically) from mailer-daemons and postmasters reporting "Undelivered messages".
They come from completely different networks, but all of them (as analysis shows) are "my" bounced-back messages.
More precisely, these "my" messages are the usual spam crap with my return address spoofed.
Apparently this someone is actually a villain widely distributed in space: the original messages come from very different computers/networks, the names vary too (but as far as I noticed all have a numerical component), and they differ in content.
There is only one common denominator: all this crap originates from Windoze computers using either M$ Outluck Excess or (rarely) Bat:
From: "Socorro Crowder16450" <ivrii@math.toronto.edu>
To: <pjk11@scasd.k12.pa.us>
Subject: Socorro, We know what the women want.
Date: Wed, 27 Jun 2007 22:04:10 -0200
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0006_01C7B920.3C8500F0"
X-Mailer: Microsoft Office Outlook, Build 11.0.5510
Thread-Index: Aca6Q375OMXIDHESU23ZTTTTT739IB==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1158
Message-ID: <01c7b907$1737c8f0$ad01cac4@ivrii>
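The "analysis" of the bounced messages can be sketched as a tally of the X-Mailer header over the returned originals. This is only a minimal sketch in Python using the standard email module; the function names mailer_of and tally_mailers are hypothetical, and it assumes each bounced original is already available as a raw text string.

```python
from collections import Counter
from email import message_from_string


def mailer_of(raw_message: str) -> str:
    """Return the X-Mailer header of a raw message, or 'unknown' if absent."""
    msg = message_from_string(raw_message)
    return msg.get("X-Mailer", "unknown")


def tally_mailers(raw_messages):
    """Count how often each X-Mailer value occurs in a collection of messages."""
    return Counter(mailer_of(raw) for raw in raw_messages)
```

Calling tally_mailers on the extracted originals and looking at most_common() shows at a glance which mail clients dominate.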
Probably a good network policy would be this: messages from Windoze computers sent to a wrong address should not bounce to the sender but should just be discarded, and no notification should be sent unless the declared return address matches the real domain in the 'Received' field.
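A rough sketch of such a check in Python follows. It rests on several assumptions that are not part of the policy as stated: it takes the From header as the declared return address (a real mail server would look at the envelope sender), it trusts the host name given after "from" in each Received header (which can itself be forged), and it ignores the mail client entirely. The function names are hypothetical.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def sender_domain(msg) -> str:
    """Domain part of the declared From address, lowercased ('' if missing)."""
    _, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def received_hosts(msg):
    """Host names that appear after 'from' in the Received headers."""
    hosts = []
    for value in msg.get_all("Received", []):
        match = re.match(r"\s*from\s+(\S+)", value, re.IGNORECASE)
        if match:
            hosts.append(match.group(1).lower().rstrip("."))
    return hosts


def should_bounce(raw_message: str) -> bool:
    """Send a bounce only if some relaying host belongs to the sender's domain."""
    msg = message_from_string(raw_message)
    domain = sender_domain(msg)
    if not domain:
        return False
    return any(h == domain or h.endswith("." + domain)
               for h in received_hosts(msg))
```

With this rule a message claiming to come from math.toronto.edu but relayed from an unrelated host is silently discarded instead of bounced.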
Posted by Victor at 05:56 PM
June 23, 2007
Digitalizing an Old Book. II
Converting to Djvu
I scanned a Russian math book (310 pp.) and got a 23 MB pdf file (155 double pages). My experiments show that
- One should not try to decrease the size of this foo.pdf file further, since that will increase the size of the resulting djvu file;
- The optimal way is to run djvudigital on my Mac, which produced a 7.2 MB foo.djvu;
- And then transfer the file to VPC and run the LizardTech DjvuExpress Trial for OCR, which increased the file to 8.1 MB;
- Despite the Russian text, the OCR was pretty good, which demonstrates the superiority of the built-in ABBYY FineReader OCR engine (any2djvu.djvuzone.org cannot handle non-English text well).
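The conversion step above can be scripted. This is a minimal sketch in Python, assuming djvudigital (from djvulibre) is on the PATH and is called in its plain "input output" form with no extra options; the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def djvudigital_command(pdf_path):
    """Build the djvudigital call that turns foo.pdf into foo.djvu."""
    pdf = Path(pdf_path)
    return ["djvudigital", str(pdf), str(pdf.with_suffix(".djvu"))]


def convert(pdf_path) -> float:
    """Run the conversion and return the size of the result in MB."""
    cmd = djvudigital_command(pdf_path)
    subprocess.run(cmd, check=True)  # raises if djvudigital fails
    return Path(cmd[-1]).stat().st_size / 1e6
```

Comparing the returned size for differently compressed input pdf files is a quick way to repeat the experiment in the first bullet.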
My later experience:
One needs to remember: for serious scanning you need serious scanning s/w; scanning directly to Acrobat is not appropriate for a serious job:
- The scanning preferences of Acrobat are not in Preferences; when you select File > Create PDF > From Scanner you can adjust Image Settings: Compression (Color/Grayscale, Monochrome, Size/Quality) and Filtering (Deskew, Background removal, Edge Shadow removal, Despeckle, Halo removal). So there is no way to select the type of material, the scanning area, or the type of scanning (color, gray, black/white); as a result the scanner tries to preview each page before scanning, determine its type and geometry, and select an appropriate mode. This makes the process way longer, and the guess is often wrong. Scanning directly to Acrobat works well if you want to scan a few separate pages rather than a book, and these pages are black and white (not yellow due to old age).
- Also, Acrobat itself can transform a color pdf to gray, but to make it b/w one needs third-party utilities (standalone applications or Acrobat plugins), which are much more expensive than good scanning s/w.
- On the other hand, using vuescan (available for Mac/Linux/Windows) or other good scanning s/w I can select the geometry, the type of scanning (b/w) and the white/black threshold manually (and I can still change it for each page), and then there is no need for a preview (so the process is way faster). I can also set the resolution precisely (300 dpi recommended; everything above 600 dpi is downsampled to 600 dpi for OCR) rather than use a slider on a quality/size scale.
- To make things worse, Acrobat scans blindly: you do not see what you got until you finish the process. Further, it does not save as it goes: you need to finish scanning and then save. Sure, you can interrupt scanning and see/save the result, but that makes the process even longer and more cumbersome.
- In contrast, vuescan shows you each page (so you can change the black/white threshold) and saves automatically when you move to the next page. It saves each page as a separate file (pdf/jpeg/tiff/raw) with the default names crop0001.pdf, crop0002.pdf, … which one can easily and automatically combine using either Acrobat or Ghostscript (v. 8.5 is fine).
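Combining the per-page files with Ghostscript can be sketched as follows. This is only a sketch in Python, assuming the gs executable is on the PATH and the pages keep the default crop0001.pdf, crop0002.pdf, … names; the function names and the output name book.pdf are hypothetical.

```python
import subprocess
from pathlib import Path


def gs_merge_command(pages, output):
    """Build a Ghostscript call that concatenates the given PDFs into one file."""
    return ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
            f"-sOutputFile={output}", *[str(p) for p in pages]]


def merge_scans(directory, output="book.pdf"):
    """Merge all crop*.pdf pages found in a directory, in name order."""
    # Zero-padded names sort correctly as plain strings.
    pages = sorted(Path(directory).glob("crop*.pdf"))
    if not pages:
        raise FileNotFoundError(f"no crop*.pdf files in {directory}")
    subprocess.run(gs_merge_command(pages, output), check=True)
    return output
```

Because vuescan saves every page as soon as you move on, the merge can be rerun at any point to check the partial book.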
Posted by Victor at 03:37 AM
Digitalizing an Old Book
I have a book published in 1985 in LNM, and it is basically a xerox copy of the manuscript, with typewritten text and handwritten formulae. It is 242 pp. long. So I decided to digitalize it. Previously I had photocopied my book, each time covering the opposite page so that each page contained no stray text. A rather unpleasant but unavoidable job. Then I scanned it to Acrobat 7. Unfortunately the ADF on my scanner broke and the ADF at the library has been gone for a long time, so the job was not extremely pleasant, especially because Acrobat previewed each page and automatically determined whether it was a BW document, a BW picture, or text/in-line art (I list only the choices that were actually made). The result was a 56 MB file.
Then I ran OCR, which increased the size slightly; also, the OCR was very timid: certain clear pieces of text were not OCRed. I tried ReadIris 9 (which IMHO is superior OCR s/w), but it was too aggressive and tried to OCR even formulae, replacing unrecognized characters by ~. Not good.
By converting the document to B/W, cropping out the margins and setting the compatibility level to pdf 1.6 only (aka Acrobat 7), I reduced the document drastically, to 11 MB, but it could not be handled by earlier versions of Acrobat or by Ghostscript 8.51 and thus was not usable by itself for further transformations. And it was poorly OCRed and not very nice looking. I converted it to a 17 MB postscript file using Acrobat 7. I then converted the ps to djvu using the djvulibre converters installed on my Mac, but they provide no OCR. So instead I used a trial version of LizardTech Djvu Document Express Pro = Djvu Editor Pro 5.0. It was a job of many hours! However, in the end I got a 14 MB djvu document which was better looking than the original pdf and had much superior OCR (I think LizardTech uses the ReadIris OCR engine). I also inserted clickable links into the table of contents using the same Djvu Editor Pro.
All this was the wrong approach. Later I did the job correctly, and the current digitalization is the result of this better approach.
Posted by Victor at 03:29 AM
June 21, 2007
Darwin theory
Recently the City Council of Toronto discussed garbage and recycling in Toronto. Basically these decisions were just a cash grab from Toronto residents. However, one of the decisions was different: "A motion to improve green bins so that they can “no longer be opened by raccoons,” passed by a vote of 43 to 2".
What the hell? These bins were supposed to be raccoon proof, but raccoons are much smarter than the city engineers, and they (raccoons, not city engineers) quickly discovered methods to open these bins. There is little doubt that raccoons will upgrade their skills to open the improved bins as well.
Evolutionary theory predicts the outcome of this competition: raccoon burglars easily breaking into the most advanced safes. Such raccoons could be employed by criminals.
Posted by Victor at 04:05 PM
June 14, 2007
Allied Bombing Campaign in WWII
The Canadian War Museum exhibition suggests that the Allied bombing campaign failed to reduce military production in Germany significantly. This is a fallacy: in that period military production in the Allied countries grew significantly, and there is no indication that without bombing military production in Germany and Japan would not have grown as well.
So the Allied bombing not only prevented a significant growth of military production in Germany and Japan but also reduced it.
Posted by Victor at 10:13 PM