String comparison in Delphi

Have you ever wondered how utilities like Beyond Compare or DIFF compare files? They do it (I guess) by solving the longest common subsequence (LCS) problem.

After reading the Wikipedia article linked above, I got an overview of the problem and of its possible solutions. So I decided to implement a Delphi class that does the string comparison trick, which is the basis for text file comparison.

Let me put it as follows: given two strings to be compared, I want to highlight in blue the characters added to the first string and in red the characters removed from it. The common (unchanged) characters will keep the default color.
 
For example:

String 1 = Delphi allows both structural and object oriented programming.

String 2 = Does Delphi allow object oriented programming?

Highlighted differences:

Does Delphi allows both structural and object oriented programming.?

The Delphi class looks like this:

type
  TDiff = record
    Character: Char;
    CharStatus: Char;  //Possible values: [+, -, =]
  end;

  TStringComparer = class
  ……………
  public
    class function Compare(aString1, aString2: string): TList<TDiff>;
  end;

When you call TStringComparer.Compare, a generic list of TDiff records is created. Each TDiff record holds a character and a status telling whether that character was added (CharStatus = ‘+’), removed (CharStatus = ‘-’) or left unchanged (CharStatus = ‘=’) when going from the first string to the second. The caller owns the returned list and must free it.
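To make the LCS idea concrete, here is a minimal sketch of how such a comparison could work. It is not the implementation referred to in this post: CompareStrings and MakeDiff are hypothetical names, the code assumes 1-based strings (desktop compilers), and it uses the plain quadratic dynamic programming table without any optimization.

```pascal
program LcsDiffSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Generics.Collections;

type
  TDiff = record
    Character: Char;
    CharStatus: Char;  //Possible values: [+, -, =]
  end;

function MakeDiff(aChar, aStatus: Char): TDiff;
begin
  Result.Character := aChar;
  Result.CharStatus := aStatus;
end;

//Hypothetical stand-in for TStringComparer.Compare
function CompareStrings(const aOld, aNew: string): TList<TDiff>;
var
  Lcs: array of array of Integer;
  i, j, n, m: Integer;
begin
  n := Length(aOld);
  m := Length(aNew);
  //Lcs[i, j] = length of the LCS of the suffixes aOld[i..n] and aNew[j..m].
  //SetLength zero-fills, so row n + 1 and column m + 1 are already 0.
  SetLength(Lcs, n + 2, m + 2);
  for i := n downto 1 do
    for j := m downto 1 do
      if aOld[i] = aNew[j] then
        Lcs[i, j] := Lcs[i + 1, j + 1] + 1
      else if Lcs[i + 1, j] >= Lcs[i, j + 1] then
        Lcs[i, j] := Lcs[i + 1, j]
      else
        Lcs[i, j] := Lcs[i, j + 1];

  //Walk the table from the start of both strings, emitting one TDiff per step
  Result := TList<TDiff>.Create;
  i := 1;
  j := 1;
  while (i <= n) or (j <= m) do
    if (i <= n) and (j <= m) and (aOld[i] = aNew[j]) then
    begin
      Result.Add(MakeDiff(aOld[i], '='));  //common character
      Inc(i);
      Inc(j);
    end
    else if (i <= n) and ((j > m) or (Lcs[i + 1, j] >= Lcs[i, j + 1])) then
    begin
      Result.Add(MakeDiff(aOld[i], '-'));  //only in the first string
      Inc(i);
    end
    else
    begin
      Result.Add(MakeDiff(aNew[j], '+'));  //only in the second string
      Inc(j);
    end;
end;

procedure Demo;
var
  Diffs: TList<TDiff>;
  Diff: TDiff;
begin
  Diffs := CompareStrings('cat', 'cart');
  try
    for Diff in Diffs do
      Write(Diff.CharStatus, Diff.Character, ' ');
    Writeln;  //prints: =c =a +r =t
  finally
    Diffs.Free;
  end;
end;

begin
  Demo;
end.
```

The table needs (n + 1) x (m + 1) integers, so this sketch is only reasonable for short strings such as the contents of two edit boxes.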

Let’s drop two edits (Edit1, Edit2), a rich edit (RichEdit1) and a button (Button1) on a Delphi form. To highlight the differences put the following code in the OnClick event of the button:

procedure TForm1.Button1Click(Sender: TObject);
var
  Differences: TList<TDiff>;
  Diff: TDiff;
begin
  //Yes, I know...this method could be refactored ;-)
  Differences:= TStringComparer.Compare(Edit1.Text, Edit2.Text);
  try
    RichEdit1.Clear;
    RichEdit1.SelStart:= RichEdit1.GetTextLen;
    for Diff in Differences do
      if Diff.CharStatus = '+' then
      begin
        RichEdit1.SelAttributes.Color:= clBlue;
        RichEdit1.SelText := Diff.Character;
      end
      else if Diff.CharStatus = '-' then
      begin
        RichEdit1.SelAttributes.Color:= clRed;
        RichEdit1.SelText:= Diff.Character;
      end
      else
      begin
        RichEdit1.SelAttributes.Color:= clDefault;
        RichEdit1.SelText:= Diff.Character;
      end;
  finally
    Differences.Free;
  end;
end;

The result looks like the image below:


For the full implementation read further down. Note that various optimizations could be added to the code below, but I didn’t implement them. Anyway, I hope this helps. Feedback is welcome! Feel free to find and correct bugs ;-)

Testing the World Away: Recovery mission

I was recently reviewing the DUnit website and noticed a broken link to an article titled “Testing The World Away”, written by Will Watts for QBS Software in November 2000.

I said “OK, maybe the article was relocated somewhere else on the QBS Software website”, so I tried a custom Google search: "Testing the World Away" site:qbssoftware.com. As you can see, the article was either dropped from Google or removed completely from the QBS Software website.

Once again I said “OK, maybe there’s a copy of the article somewhere else on the Internet” and I tried a second custom Google search "Testing the World Away". At this point I convinced myself that the article was gone for good.

I am a curious guy, so I tried one final thing: I looked up the broken link[1] on the Internet Archive website and voilà! They came up with an archived version of the article.

I have shared a copy of the article below so that we can take a look. As I said, this article is not mine, and if the author (or owner) at some point asks me to delete it from my blog, I will do so.

[1] http://www.qbss.com/html/news/news_body.asp?content=ARTICLE&link=368&zone= 

Testing the World Away (01 November 2000)

The software methodology of the hour is Kent Beck’s ‘Extreme Programming’. Mr Beck is a Smalltalk programmer by trade and, I think, a bit of a lad by inclination (evidence: the bibliography of his book Extreme Programming Explained, as well as citing standards such as The Mythical Man-Month and Design Patterns, also recommends Cynthia Heimel’s Sex Tips for Girls. Right on!). I find some of his ideas unconvincing. Pair Programming for example, where one programmer sits at the keyboard and works while the other does something else - possibly flower-arranging, my concentration lapsed at this point in the text as I tried to imagine any manager I’ve met who would permit this exciting way of increasing his costs - seems too beautiful and delicate for our mortal coil. On the other hand his approach to testing, and the free libraries based on his design that are around to back it up, comes much nearer to hitting, as I suppose Ms Heimel might put it, my programming G-spot.

Multiple Entries to the US with the same I-94 Form: Just for Canadian Residents

I am a Permanent Resident of Canada (Landed Immigrant) and I got a B-2 Visitor Visa which I used in order to travel to the United States of America for 15 days.

I crossed the border at the Rainbow Bridge at Niagara Falls. On the American side of the bridge there is a US Point of Entry, where I was asked for my papers. I presented my Cuban passport with a one-time B-2 Visitor Visa to the American officer controlling the crossing. The officer kept my passport and sent me to an office located just a few meters further on.

I waited for 40 minutes and was then called for an interview with another officer. He asked about the purpose of the trip, the intended duration of the stay, my destinations within the States, my means of returning (flight and bus tickets) [1], my relation to the people I was visiting in the US, and things like that.

The officer issued an I-94 Form, which he stapled into my passport. I paid a $6.00 USD fee for the I-94 Form and my passport was returned to me. (Make sure you have exactly $6.00 USD in cash if you want to get this done quickly. You can pay by credit card as well, but it takes more time.)

The I-94 Form was issued for 6 months, meaning that I was permitted to stay in the US until the expiry date, 6 months later.


On my way back to Canada I crossed the Rainbow Bridge again (in the opposite direction). This time I was stopped by a Canadian customs officer. She asked for my papers: I gave her my passport and my Canadian Resident Card [2]. She asked pretty much the same questions that the American officer did, and she also inquired about any merchandise that I was bringing from the US.

I was expecting her to remove the I-94 from the passport, but she didn’t. Usually, you need to surrender your I-94 when leaving the US, but it seems that Canadian residents can use the same I-94 Form (if not expired) to re-enter the US multiple times. It seems that you can do this regardless of the validity of your visa: the only thing taken into consideration is that the I-94 Form has to be valid.

Looking for extra validation of my theory, I found this post. I quote the main part below:

You don't have to get a new I-94 every time
you enter the USA. How do I know ?, because
I've being traveling to the USA for the past
4.5 years and using the same I-94 until its
expiry date. In fact the INS officer told me
that I did not have to turn in the I-94 if I
intent to enter the USA prior to the expiry
date.

I also talked to a few friends who confirmed that they used the I-94 Form to re-enter the US from Canada, even after their visa had expired.

I looked further to validate this theory and I found out that:

If taking short trips (30 days or less) to Canada, Mexico, or the Caribbean Islands during the course of your visit to the U.S., hold onto your I-94 or I-94 (W); it should only be turned in when you leave the U.S. to return home. [U.S. Department of State]

After this, I was pretty confident in my theory and I went to the US a second time. This time I crossed the border at the Peace Bridge. I was not expecting any problems: my visa had expired, but my I-94 had not. You know what? I did have a problem…

The American officer told me that they allow most people visiting the US to re-enter by just using the I-94 (if not expired). The problem in my case was that I am a Cuban national. Cubans (among other nationalities) are treated with special care by the US government. That meant that I was not able to use my I-94 Form to re-enter the US.

I was very disappointed, but the American officer wanted to support my case. She noticed that I had made an honest mistake. So she took my fingerprints, had me fill out some forms and, with that, supported my case with her superiors. Not even her immediate superior was able to allow me to enter the US. The whole process took about 3 hours and, of course, I missed my bus and my plane.

After all this, they gave me a waiver, which is a one-time permit to enter the US without a visa. I was glad at this point, because those American officers helped me while they were enforcing the law. They removed the original 6-month I-94 from my passport and issued a new one for just one week (I was going this time for the weekend, so it was good enough for me).

They explicitly told me to surrender this new I-94 Form to the Canadian customs officer on my way back to Canada. I did that, of course. Note that you have to ask the Canadian officer to remove the I-94 from your passport. The Canadian officer does not remove it if you don’t ask.

So, the conclusion is that you can use the I-94 Form (if not expired) to re-enter the US from Canada, but this does NOT apply to nationals of Cuba, Iran, Syria, and Sudan. This list could change at any moment.

Disclaimer: This is based on my own experience as a Permanent Resident of Canada. You should not consider this legal advice of any kind.

Just one more thing: after all this hassle I got a new B-2 Visitor Visa to enter the US. So, I will be going to the States soon. This time I don’t expect any problems :-)

[1] I was traveling from Toronto to Buffalo by land, passing through Niagara Falls. Then, I was traveling from Buffalo to Miami by air. I took the same route in reverse on my way back.
[2] Don’t forget your Canadian Resident Card. You need it to re-enter Canada.

Delphi Implementation for the OpenSubtitles API

OpenSubtitles.org allows searching and hosting subtitles in several formats (SRT, SUB, etc.) and in pretty much every language. It currently has a vast database of subtitles that expands every day. OpenSubtitles.org also exposes an XML-RPC-based API that can be used to build third-party applications with subtitle features.

I am writing a Delphi app to search subtitles in the OpenSubtitles.org database, and I thought it would be nice to have a Delphi wrapper for the whole API. Below is my three cents' contribution. I will probably implement and share more methods in the future, but feel free to contribute as well. Take a look at the full API methods list.

XML-RPC stands for XML Remote Procedure Call. It allows “remote procedure calling using HTTP as the transport and XML as the encoding” [http://xmlrpc.scripting.com/spec]. XML-RPC is really easy to implement: in the code below I have used format strings to build the XML requests (XML encoding is still pending) and the Indy TIdHTTP component to send them.

unit OpensubtitlesAPI;

interface

uses
  IdHTTP, Classes, SysUtils;

  function LogIn(aUsername, aPassword,
                 aLanguage, aUserAgent: string): string;
  function LogOut(aToken: string): string;
  function SearchSubtitles(aToken, aSublanguageID,
                           aMovieHash: string;
                           aMovieByteSize: Cardinal): string;  overload;
  function SearchSubtitles(aToken, aSublanguageID: string;
                           aImdbID: Cardinal): string; overload;
  function SearchSubtitles(aToken, aSublanguageID,
                           aQuery: string): string;  overload;

implementation

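// Posts aRPCRequest to the OpenSubtitles XML-RPC endpoint and returns the raw
// XML response. Any exception raised by the request is swallowed and an empty
// string is returned instead.
function XML_RPC(aRPCRequest: string): string;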
const
  cURL= 'http://api.opensubtitles.org/xml-rpc';
var
  lHTTP: TIdHTTP;
  Source,
  ResponseContent: TStringStream;
begin
  lHTTP := TIdHTTP.Create(nil);
  lHTTP.Request.ContentType := 'text/xml';
  lHTTP.Request.Accept := '*/*';
  lHTTP.Request.Connection := 'Keep-Alive';
  lHTTP.Request.Method := 'POST';
  lHTTP.Request.UserAgent := 'OS Test User Agent';
  Source := TStringStream.Create(aRPCRequest);
  ResponseContent:= TStringStream.Create;
  try
    try
      lHTTP.Post(cURL, Source, ResponseContent);
      Result:= ResponseContent.DataString;
    except
      Result:= '';
    end;
  finally
    lHTTP.Free;
    Source.Free;
    ResponseContent.Free;
  end;
end;

function LogIn(aUsername, aPassword, aLanguage, aUserAgent: string): string;
const
  LOG_IN = '<?xml version="1.0"?>' +
           '<methodCall>' +
           '  <methodName>LogIn</methodName>' +
           '  <params>'   +
           '    <param>'  +
           '      <value><string>%0:s</string></value>' +
           '    </param>' +
           '    <param>'  +
           '      <value><string>%1:s</string></value>' +
           '    </param>' +
           '    <param>'  +
           '      <value><string>%2:s</string></value>' +
           '    </param>' +
           '    <param>'  +
           '      <value><string>%3:s</string></value>' +
           '    </param>' +
           '  </params>'  +
           '</methodCall>';
begin
  //TODO: XML Encoding
  Result:= XML_RPC(Format(LOG_IN, [aUsername, aPassword, aLanguage, aUserAgent]));
end;

function LogOut(aToken: string): string;
const
  LOG_OUT = '<?xml version="1.0"?>' +
           '<methodCall>' +
           '  <methodName>LogOut</methodName>' +
           '  <params>'   +
           '    <param>'  +
           '      <value><string>%0:s</string></value>' +
           '    </param>' +
           '  </params>'  +
           '</methodCall>';
begin
  //TODO: XML Encoding
  Result:= XML_RPC(Format(LOG_OUT, [aToken]));
end;

function SearchSubtitles(aToken, aSublanguageID, aMovieHash: string; aMovieByteSize: Cardinal): string;
const
  SEARCH_SUBTITLES = '<?xml version="1.0"?>' +
                     '<methodCall>' +
                     '  <methodName>SearchSubtitles</methodName>' +
                     '  <params>' +
                     '    <param>' +
                     '      <value><string>%0:s</string></value>' +
                     '    </param>' +
                     '  <param>' +
                     '   <value>' +
                     '    <array>' +
                     '     <data>' +
                     '      <value>' +
                     '       <struct>' +
                     '        <member>' +
                     '         <name>sublanguageid</name>' +
                     '         <value><string>%1:s</string>' +
                     '         </value>' +
                     '        </member>' +
                     '        <member>' +
                     '         <name>moviehash</name>' +
                     '         <value><string>%2:s</string></value>' +
                     '        </member>' +
                     '        <member>' +
                     '         <name>moviebytesize</name>' +
                     '         <value><double>%3:d</double></value>' +
                     '        </member>' +
                     '       </struct>' +
                     '      </value>' +
                     '     </data>' +
                     '    </array>' +
                     '   </value>' +
                     '  </param>' +
                     ' </params>' +
                     '</methodCall>';

begin
  //TODO: XML Encoding
  Result:= XML_RPC(Format(SEARCH_SUBTITLES, [aToken, aSublanguageID, aMovieHash, aMovieByteSize]));
end;

function SearchSubtitles(aToken, aSublanguageID: string;
  aImdbID: Cardinal): string;
const
  SEARCH_SUBTITLES = '<?xml version="1.0"?>' +
                     '<methodCall>' +
                     '  <methodName>SearchSubtitles</methodName>' +
                     '  <params>' +
                     '    <param>' +
                     '      <value><string>%0:s</string></value>' +
                     '    </param>' +
                     '  <param>' +
                     '   <value>' +
                     '    <array>' +
                     '     <data>' +
                     '      <value>' +
                     '       <struct>' +
                     '        <member>' +
                     '         <name>sublanguageid</name>' +
                     '         <value><string>%1:s</string>' +
                     '         </value>' +
                     '        </member>' +
                     '        <member>' +
                     '         <name>imdbid</name>' +
                     '         <value><string>%2:d</string></value>' +
                     '        </member>' +
                     '       </struct>' +
                     '      </value>' +
                     '     </data>' +
                     '    </array>' +
                     '   </value>' +
                     '  </param>' +
                     ' </params>' +
                     '</methodCall>';

begin
  //TODO: XML Encoding
  Result:= XML_RPC(Format(SEARCH_SUBTITLES, [aToken, aSublanguageID, aImdbID]));
end;

function SearchSubtitles(aToken, aSublanguageID,
  aQuery: string): string;
const
  SEARCH_SUBTITLES = '<?xml version="1.0"?>' +
                     '<methodCall>' +
                     '  <methodName>SearchSubtitles</methodName>' +
                     '  <params>' +
                     '    <param>' +
                     '      <value><string>%0:s</string></value>' +
                     '    </param>' +
                     '  <param>' +
                     '   <value>' +
                     '    <array>' +
                     '     <data>' +
                     '      <value>' +
                     '       <struct>' +
                     '        <member>' +
                     '         <name>sublanguageid</name>' +
                     '         <value><string>%1:s</string>' +
                     '         </value>' +
                     '        </member>' +
                     '        <member>' +
                     '         <name>query</name>' +
                     '         <value><string>%2:s</string></value>' +
                     '        </member>' +
                     '       </struct>' +
                     '      </value>' +
                     '     </data>' +
                     '    </array>' +
                     '   </value>' +
                     '  </param>' +
                     ' </params>' +
                     '</methodCall>';

begin
  //TODO: XML Encoding
  Result:= XML_RPC(Format(SEARCH_SUBTITLES, [aToken, aSublanguageID, aQuery]));
end;

end.
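
The "TODO: XML Encoding" comments in the unit above could be addressed with a small helper along the following lines. This is only a sketch: XmlEscape is a hypothetical name that is not part of the unit, and it covers just the characters that are unsafe inside element content ('&' and '<', plus '>' for good measure).

```pascal
program XmlEscapeSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

//Hypothetical helper: escapes a value before it is placed between
//<string> and </string> in an XML-RPC request
function XmlEscape(const aValue: string): string;
begin
  //'&' must be replaced first, otherwise the entities added by the
  //next two calls would be escaped a second time
  Result := StringReplace(aValue, '&', '&amp;', [rfReplaceAll]);
  Result := StringReplace(Result, '<', '&lt;', [rfReplaceAll]);
  Result := StringReplace(Result, '>', '&gt;', [rfReplaceAll]);
end;

begin
  Writeln(XmlEscape('Tom & Jerry <1940>'));
  //prints: Tom &amp; Jerry &lt;1940&gt;
end.
```

With such a helper in place, each string argument would be wrapped before formatting, for example Format(LOG_IN, [XmlEscape(aUsername), XmlEscape(aPassword), XmlEscape(aLanguage), XmlEscape(aUserAgent)]).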


Finally, here are some sample calls:

Logging in anonymously (empty credentials) and getting the token:

LogIn('', '', 'en', 'OS Test User Agent');
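
LogIn returns the raw XML response, so the token still has to be pulled out of it. Below is a rough sketch of one way to do that with plain string searching. ExtractStringMember is a hypothetical helper, and the sample response is hand-written under the assumption that the token arrives as a struct member named token holding a string value; a real XML parser would be more robust.

```pascal
program TokenSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

//Hypothetical helper: returns the <string> value of the struct member
//called aMemberName, or '' if the member cannot be found
function ExtractStringMember(const aResponse, aMemberName: string): string;
const
  cOpen = '<string>';
  cClose = '</string>';
var
  NamePos, StartPos, EndPos: Integer;
begin
  Result := '';
  NamePos := Pos('<name>' + aMemberName + '</name>', aResponse);
  if NamePos = 0 then
    Exit;
  //First <string> element after the member name
  StartPos := PosEx(cOpen, aResponse, NamePos);
  if StartPos = 0 then
    Exit;
  Inc(StartPos, Length(cOpen));
  EndPos := PosEx(cClose, aResponse, StartPos);
  if EndPos = 0 then
    Exit;
  Result := Copy(aResponse, StartPos, EndPos - StartPos);
end;

const
  //Hand-written fragment (assumed layout), not a captured server response
  cSample = '<member><name>token</name>' +
            '<value><string>81nt6bgl9vde06l3ptq7v1a7r1</string></value>' +
            '</member>';

begin
  Writeln(ExtractStringMember(cSample, 'token'));
  //prints: 81nt6bgl9vde06l3ptq7v1a7r1
end.
```

In the application this would be used as ExtractStringMember(LogIn('', '', 'en', 'OS Test User Agent'), 'token'), keeping in mind that an empty result can mean either a failed request or an unexpected response layout.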

Logging out (disposing the token):

LogOut('81nt6bgl9vde06l3ptq7v1a7r1');

Searching English subtitles for the movie whose ImdbID is 120737 (the first argument is the token):

SearchSubtitles(Edit1.Text, 'eng', 120737);

Searching English subtitles for The Lord of the Rings:

SearchSubtitles(Edit1.Text, 'eng', 'The Lord of the Rings');


Searching English subtitles for the movie whose hash is 7d9cd5def91c9432 and whose size is 735934464 bytes:

SearchSubtitles(Edit1.Text, 'eng', '7d9cd5def91c9432', 735934464);