
15 February 2012

Mixing Landscape and Portrait rendering in a Wicked PDF document

Wicked PDF is an awesome Ruby library for generating PDF documents from plain old HTML/CSS.

HTML and CSS are not just another language; they provide a model for representing a document, and if you were obliged to use something else you would eventually end up representing your documents using this model, even if your syntax layer differed from HTML (if you're really smart, you would have ended up with HAML/SASS, for example; same model, different prettiness).

This is why other libraries fail (where "fail" means "I don't like them") - they oblige you to learn a whole new model for representing documents.

Anyway, the point is that you might believe from the Wicked README that you can create a document in landscape mode, or in portrait mode, but you're out of luck if you want both in the same doc.

It turns out you're not out of luck; Wicked, as its name non-obviously suggests, ultimately relies on WebKit (via wkhtmltopdf) to render HTML pages. With WebKit, you have access to a whole bunch of modern CSS properties, including those that rotate your document. You don't even care that they're WebKit-specific, because you don't have to care about cross-browser support: you're using a known WebKit version running on your own server, which you control.

Here's the CSS:

.page {
  width:            195mm;
  height:           270mm;
  page-break-after: always;
  overflow:         hidden;
}

.page .landscape {
  position:                  relative;
  margin:                    270mm 0;
  width:                     270mm;
  height:                    195mm;
  -webkit-transform:         rotate(-90deg);
  -webkit-transform-origin:  0mm 0mm;
}

The .page rules simply define an A4 page (after margins), and guarantee a page-break at the end of each page, just in case your printer didn't understand. The CSS assumes that you print in portrait by default. When you want landscape, nest an element with class "landscape" inside your "page" element. Here's an example:

<div class="page">
  This is the first page. It gets printed in portrait mode.
</div>

<div class="page">
  <div class="landscape">
    This is the second page. It gets printed in landscape mode.
    You will have to twist your head to read it.
  </div>
</div>

Not so bad, no? Good luck!

06 February 2012

MySQL "LOAD DATA INFILE" with no primary key

I'm migrating a whole bunch of legacy data into MySQL for a client. I export from their system, use sed to patch the nonconforming, unparseable CSV it outputs, convert from MacRoman to UTF-8, then tell MySQL to load it all into a series of tables from which the data will later be integrated into the standard system.

Today I realised the error of my ways: the column that I was using as a primary key isn't a primary key at all (at least not in the modern sense - of having a unique value - that we've become accustomed to since Codd invented the relational model in 1969).

I could have realised this earlier by just declaring it a primary key and watching MySQL spit at me, and therefore have fixed the problem sooner ... but no, who needs tests and constraints when assumptions are so much faster?

Enough about me ... the problem now is to assign each row in the exported data a unique id, and the easiest, obviousest way to do that is to include an id integer primary key auto_increment column in each table definition, let LOAD DATA INFILE etc etc shove the data into the table, and the primary key takes care of itself.

But it wasn't obvious how to make this work without warnings about mismatched columns. Here's the clever bit: what I did in the end was to add the id column after loading the data. In other words, I imported the data into a table with no id primary key column; and added that column afterwards. It's easy to script, it does exactly what I want, and it produces no spurious warnings. Perfecto!

Here's the concept in code:

> create table legacy_stuff(leg_col_1 varchar(255), leg_col_2 varchar(255), et_cetera_1 varchar(255), et_cetera_2 varchar(255));

> load data infile 'legacy_stuff.csv' into table legacy_stuff; -- plus all your favourite options;

> show warnings;

> alter table legacy_stuff add column id integer primary key auto_increment;

Good luck...

Update: for another way to look at this, see the stackoverflow page on this topic. It turns out you can use load data infile with the list of columns you want to import to, in which case MySQL will match the columns of your CSV to your specified column list. This way you can let your auto-increment primary key column simply manage itself. This solution does not suit my particular situation as I'm importing tables with hundreds of columns (they didn't believe in normalisation where this data comes from), and I don't want to have to maintain the column list in multiple places.
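If you do go the column-list route, the list doesn't have to be typed out by hand. Here's a minimal Ruby sketch, assuming the first line of your CSV is a header row naming the columns (which may well not hold for your export) and that fields are comma-separated. The helper name load_data_statement is made up for illustration; it isn't part of MySQL or any gem.

```ruby
# Build a "load data infile" statement with an explicit column list taken
# from the CSV header, so an auto_increment id column can manage itself.
# Assumption: header_line is the first line of the CSV and names the columns.
def load_data_statement(table, csv_path, header_line)
  columns = header_line.strip.split(",").map { |name| "`#{name.strip}`" }
  "load data infile '#{csv_path}' into table #{table} " \
    "fields terminated by ',' ignore 1 lines (#{columns.join(', ')});"
end

puts load_data_statement("legacy_stuff", "legacy_stuff.csv", "leg_col_1,leg_col_2\n")
```

The "ignore 1 lines" clause skips the header row itself, and the column list comes last, which is where MySQL expects it.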

30 January 2012

ActiveRecord, I18n, Date, and UX

From a usability perspective, there's a lot to be said for calendar-oriented date input (the kind with a drop-down list for each of day, month, and year, or even a javascripty popup that makes a calendar with pix of lovely ladies) so that your users can simply point to the date they want without needing to think extensively. Also, it's not possible for them to write "LOL" or "thou villainous, clay-brained fustinarian" or any other reddit/4chan internet-meme-insult-joke kind of thing instead of an actual date.

On top of all those advantages, this approach totally avoids the issue of a certain country habitually writing down the month and the date the wrong way around.

From a frustratability perspective though, if your clients are people who use your system day in, day out because their business depends on it, you should probably consider them expert users, willing to learn a particular date format so they can type it in a text box, because that's a helluva lot faster than picking your way through calendar boxes.

Ruby's Date class will parse from a wide variety of formats:


> Date.parse "december 15th, 1965"
 => Wed, 15 Dec 1965
> Date.parse "1st jan"
 => Sun, 01 Jan 2012
> Date.parse "1st jan 2038"
 => Fri, 01 Jan 2038
> Date.parse "1 jul 2000"
 => Sat, 01 Jul 2000
> Date.parse "jul 2nd 2000"
 => Sun, 02 Jul 2000
> Date.parse "27/2/2000"
 => Sun, 27 Feb 2000
> Date.parse "19-09-2009"
 => Sat, 19 Sep 2009

The only problem is when your users happen to be French. (Or German, or Swedish, or any of those crazy places that don't speak Ze English).


> Date.parse "19-fevrier-2013"
 => Thu, 19 Jan 2012
> Date.parse "19-mars-2013"
 => Tue, 19 Mar 2013
> Date.parse "19-avril-2013"
 => Thu, 19 Jan 2012

Something doesn't look right here. March works because (I suppose) Ruby looks at only the first three characters of the month token, so "mars" in French works out the same as "March" in English.

This is all a problem, because your expert users, impatiently bristling to get their hands on your finely-crafted app, are going to enter dates in a non-English language in a text field in a form on a web page that you are going to feed directly to an ActiveRecord object:


def update
  @widget.update_attributes params[:widget]
end

where params[:widget][:expires] is "30 avr 2014" for example. Ouch. Internally, ActiveRecord calls Date._parse (as does Date.parse which we've been looking at above, so you can guess what ActiveRecord will do by looking at what Date.parse does).

So, what you were really looking for was a patch for Date._parse that will automagically convert those foreigner month names so that stuff works as it should.

As it happens, I've written a little gist that you can take home with you and enjoy. I keep it under config/initializers/date.rb. It monkey-patches Date._parse to gsub anything that might be a month, before forwarding to the original Ruby implementation, only if the current locale is not :en. It relies on translation strings that you provide via Rails' I18n library. Specifically, it looks up date.abbr_month_names, and date.month_names to construct a map from $local_month_name to $month_name_in_english that Ruby can deal with.

The code is in a GitHub gist, in case you can't see it here in your reader.
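To give a feel for the translate-then-parse idea, here's a minimal, self-contained sketch. It is not the gist: it doesn't patch Date._parse, and the hard-coded FRENCH_TO_ENGLISH_MONTHS hash is a hypothetical stand-in for what date.month_names and date.abbr_month_names would supply through I18n. The method names anglicise_months and parse_local_date are likewise invented for illustration.

```ruby
# encoding: UTF-8
require "date"

# Hypothetical stand-in for the I18n lookups; a real app would build this
# map from its locale files instead of hard-coding it.
FRENCH_TO_ENGLISH_MONTHS = {
  "janvier" => "January", "janv" => "Jan",
  "février" => "February", "fevrier" => "February", "févr" => "Feb",
  "mars" => "March",
  "avril" => "April", "avr" => "Apr",
  "mai" => "May", "juin" => "June",
  "juillet" => "July", "juil" => "Jul",
  "août" => "August", "aout" => "August",
  "septembre" => "September", "sept" => "Sep",
  "octobre" => "October", "oct" => "Oct",
  "novembre" => "November", "nov" => "Nov",
  "décembre" => "December", "decembre" => "December", "déc" => "Dec"
}

# Swap every word that is a known local month name for its English
# equivalent; anything else passes through untouched.
def anglicise_months(text, months = FRENCH_TO_ENGLISH_MONTHS)
  text.gsub(/[[:alpha:]]+/) { |word| months.fetch(word.downcase, word) }
end

# Translate first, then hand over to Ruby's own parser.
def parse_local_date(text)
  Date.parse(anglicise_months(text))
end

puts parse_local_date("19-fevrier-2013")
puts parse_local_date("30 avr 2014")
```

Matching whole words rather than substrings keeps the substitution from chewing on things like "1st" or a stray "mai" inside a longer word.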

28 November 2011

spk, or, rspec for lazies

My tests fail. Fortunately, they fail; otherwise I wouldn't be sure they were really working. Anyway, my full rspec suite is over 1600 examples now, and it takes about 90 seconds to run (including 20 seconds of overhead booting Rails). After tweaking my code, I want to re-run only the failing tests, and I hate typing pathnames. I would also like to re-run a very narrow subset of my specs without having to type the entire path to the spec in question.

For example:

$ spk
......................................FFFF..F....FFF................................[etc]

/Users/conan/myproject/spec/controllers/admin/invoices_controller_spec.rb:96 Expected 'this', got 'that'

1 failed, etc etc

$ spk in/invoices_controller_s
spec/controllers/admin/invoices_controller_spec.rb

The second command runs just the spec file that failed the first time around.

What's more, using rspec, I can easily choose a directory of specs to run, like this:

$ rspec spec/controllers/

but occasionally I would like to run only a specific subset of specs, for example invoice-related specs, because I've made big changes to my invoice class, and I want to check for ripples in neighbouring code:

$ rspec spec/controllers/admin/invoices_controller_spec.rb spec/models/invoice_spec.rb spec/views/admin/invoices/* spec/helpers/admin/invoices_helper_spec.rb

without having to type all that stuff.

In other words, I want to select a horizontal subset of my specs. For example, if I just want to run invoice-related specs:

$ spk invoice
spec/controllers/admin/invoices_controller_spec.rb
spec/models/invoice_spec.rb
spec/views/admin/invoices/index.html.haml_spec.rb
spec/views/admin/invoices/show.html.haml_spec.rb
spec/views/admin/invoices/_form.html.haml_spec.rb
spec/helpers/admin/invoices_helper_spec.rb

....................................................................................[etc]

The good news is: here's a wee script to do just that:

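#!/bin/bash

# With no argument, run the whole suite.
if [ -z "$1" ]
then
  time rspec spec
else
  # Otherwise run only the spec files whose path matches the given pattern.
  SPK_FILES=$(find spec -type f -name "*_spec.rb" | grep -- "$1")
  # Quoted, so the matching files are listed one per line.
  echo "$SPK_FILES"
  # Left unquoted on purpose, so each path becomes a separate argument.
  time rspec $SPK_FILES --format documentation
fi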

Call it "spk" or whatever you want, put it in your path, chmod it 755, and use it as indicated above.
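If you'd rather keep the selection logic in Ruby, the same "horizontal subset" idea can be sketched like this. It's an illustration, not a replacement for the script: select_specs is a made-up helper, and unlike grep it matches the fragment literally rather than as a regular expression.

```ruby
# Return the spec paths that contain the given fragment, much like
# `find spec -name "*_spec.rb" | grep fragment` in the shell script.
# Assumption: the fragment is a plain substring, not a regex.
def select_specs(paths, fragment)
  paths.select { |path| path.end_with?("_spec.rb") && path.include?(fragment) }
end

# Dir.glob returns an empty list if there is no spec directory here.
all_specs = Dir.glob("spec/**/*_spec.rb")
puts select_specs(all_specs, "invoice")
```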

Enjoy!

12 November 2011

Lexmark considered evil

Right, you'll tell me, they make printers, what was I expecting? Well, I stumbled on this only today: I can't print a black/white-only document using my Lexmark S600 if the colour cartridges are low. There is no technical justification for this. Black/white printers have been printing black/white documents for many years without needing colour cartridges. A Lexmark S600 could do the same. But Lexmark clearly cares less about you than about its bank account.

I'd love to say, never buy a Lexmark again, but I don't know of another manufacturer that's less crappy. Suggestions welcome.

26 September 2011

Incestuous Sed, or, 's_..*_s/\\"&\\"/\\\\\\"&\\\\\\"/g_g'

I have a CSV file to import; I have no control over the producer of this file, and its output is unfortunately non-conforming; it encloses every non-numeric field in double quotes, and fails to escape double quotes within the field. In other words, I have something like this:

1,"Foo","A "Lord of the Rings" expert","Blah",123.45

While it is possible to imagine a parser that might be able to cope with this, in my case I'm importing into MySQL ("load data infile ...") and MySQL has no plans to accommodate this kind of CSV right now today. In order to import this, I need to transform it to the following:

1,"Foo","A \"Lord of the Rings\" expert","Blah",123.45

One solution is to detect quotes that are not part of the pattern /","/, but that gets tricky for first, last, and numeric columns. Given that the data is finite and changes slowly, I decided to write down the expressions that needed fixing, and write a sed expression to fix them.

To fix the above, all I need is

cat data.csv | sed 's/\"Lord\ of\ the\ Rings\"/\\\"Lord\ of\ the\ Rings\\\"/g' > clean-data.csv

But given a list of expressions to fix, I don't want to go the error-prone way of typing out all these sed commands line-by-line, making sure to escape all the spaces and other special chars, and counting backslashes. What can I use to transform these expressions into sed commands? Why, sed, of course! Here's how I transform a list of expressions into a list of corresponding quote-escaping sed commands, for use immediately afterwards in the same script:

cat quoted_terms.txt | sed -e 's/[ ?]/\\&/g' -e 's_.*_s/\\"&\\"/\\\\\\"&\\\\\\"/g_g' > clean.sed
cat data.csv | sed -f clean.sed > clean-data.csv

Yes, it's Backslash Hell!! The first line transforms this:

Lord of the Rings
The Canterbury Tales
Ulysses
Is Anybody Home?

into this:

s/\"Lord\ of\ the\ Rings\"/\\\"Lord\ of\ the\ Rings\\\"/g
s/\"The\ Canterbury\ Tales\"/\\\"The\ Canterbury\ Tales\\\"/g
s/\"Ulysses\"/\\\"Ulysses\\\"/g
s/\"Is\ Anybody\ Home\?\"/\\\"Is\ Anybody\ Home\?\\\"/g

Then the second line transforms this:

1,"Foo","A "Lord of the Rings" expert","Blah",123.45
2,"Bar","Read all of "The Canterbury Tales"","Blah",234.56
3,"Toto",""Ulysses"","Blah",345.67
4,"Titi","Asked "Is Anybody Home?"","Blah",456.78

into this:

1,"Foo","A \"Lord of the Rings\" expert","Blah",123.45
2,"Bar","Read all of \"The Canterbury Tales\"","Blah",234.56
3,"Toto","\"Ulysses\"","Blah",345.67
4,"Titi","Asked \"Is Anybody Home?\"","Blah",456.78

And voilà, clean csv all ready for import ... all thanks to the power of sed to mate with itself and generate more sed ...
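For anyone allergic to backslashes, here's the same fix-the-known-terms idea as a minimal Ruby sketch. It's a swapped-in alternative rather than what I ran: escape_quoted_terms is a hypothetical helper, and it assumes, as the sed version does, that none of the listed terms is also a complete field on its own.

```ruby
# For each known term, turn "term" into \"term\" wherever it appears in the
# line. The block form of gsub is used so the backslashes in the replacement
# are taken literally instead of being read as backreferences.
def escape_quoted_terms(line, terms)
  terms.reduce(line) do |fixed, term|
    fixed.gsub(%("#{term}")) { %(\\"#{term}\\") }
  end
end

terms = ["Lord of the Rings", "The Canterbury Tales", "Ulysses", "Is Anybody Home?"]
puts escape_quoted_terms('1,"Foo","A "Lord of the Rings" expert","Blah",123.45', terms)
```

Because the pattern is a plain string rather than a regex, the spaces and question marks in the terms need no escaping at all.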

Now, you can go and enjoy Sed - An Introduction and Tutorial by Bruce Barnett, because I'm not going to try to explain all those backslashes.

03 March 2011

Convert Your MySQL Database from Latin-1 to UTF-8

It all started when I tried upgrading to Ruby 1.9.2 and learned more than I ever wanted to know about character encodings. All of a sudden, my site was showing text humans were never supposed to read, with gibberish in place of recognisably foreign accented letters.

I tried using the mysql2 gem, and setting Encoding.default_external = 'UTF-8' in my environment.rb; these steps were necessary but not enough.

After much googling, it became evident that I had to go through each text field in each row in each table, and convert each latin-1 character to utf-8.

You would think that alter table #{table} convert to character set utf8 would do the trick, but no. You would be wrong. At least, I was.

Many authors have chimed in on this topic, but my hat goes off to Derek Sivers who showed the light in an O'Reilly article on converting latin1 to utf-8 in mysql.

I didn't want to do all the work he did, and figured a rails/activerecord migration might ease the pain somewhat. Below you'll find what I came up with. Re-use as you please. You'll need to specify the table/column names that need converting, and you might want to make sure I've covered all the characters that matter to you.

Basically, all this does is iterate over the tables and columns you specify, then iterate over all the shady latin-1 characters you need to fix, asking MySQL to replace each one with its utf-8 equivalent. Someone with stronger mysql-fu might find a cleverer way to do this; in the meantime, here goes:


# encoding: UTF-8

class ConvertMySqlLatin1ColumnsToUtf8 < ActiveRecord::Migration
  def self.up

    execute("set names utf8")

    # change this hash for your application. This example here is for a
    # totally original blog application concept.
    keys = {
        :authors   => %w{first_name last_name},
        :blogs => %w{name description},
        :entries => %w{title content tags},
        :comments => %w{content}
    }

    conversions = {
      'C383C2A1'         => 'á', 'C383C2A0'       => 'à', 'C383C2A4'       => 'ä', 'C383C2A2'   => 'â',
      'C383C2A9'         => 'é', 'C383C2A8'       => 'è', 'C383C2AB'       => 'ë', 'C383C2AA'   => 'ê',
      'C383C2AD'         => 'í', 'C383C2AC'       => 'ì', 'C383C2AF'       => 'ï', 'C383C2AE'   => 'î',
      'C383C2B3'         => 'ó', 'C383C2B2'       => 'ò', 'C383C2B6'       => 'ö', 'C383C2B4'   => 'ô',
      'C383C2BA'         => 'ú', 'C383C2B9'       => 'ù', 'C383C2BC'       => 'ü', 'C383C2BB'   => 'û',
      'C383C281'         => 'Á', 'C383E282AC'     => 'À', 'C383E2809E'     => 'Ä', 'C383E2809A' => 'Â',
      'C383E280B0'       => 'É', 'C383CB86'       => 'È', 'C383E280B9'     => 'Ë', 'C383C5A0'   => 'Ê',
      'C383C28D'         => 'Í', 'C383C592'       => 'Ì', 'C383C28F'       => 'Ï', 'C383C5BD'   => 'Î',
      'C383E2809C'       => 'Ó', 'C383E28099'     => 'Ò', 'C383E28093'     => 'Ö', 'C383E2809D' => 'Ô',
      'C383C5A1'         => 'Ú', 'C383E284A2'     => 'Ù', 'C383C593'       => 'Ü', 'C383E280BA' => 'Û',
      'C385C2B8'         => 'Ÿ', 'C385E2809C'     => 'œ', 'C383C2B8'       => 'ø', 'C383C2BF'   => 'ÿ',
      'C3A2E282ACC593'   => '“', 'C3A2E282ACC29D' => '”', 'C3A2E282ACCB9C' => '‘',
      'C3A2E282ACE284A2' => '’', 'C382C2AB'       => '«', 'C382C2BB'       => '»',
      'C383C2A5'         => 'å', 'C383E280A6'     => 'Å', 'C383C5B8'       => 'ß', 'C383E280A0' => 'Æ', 
      'C383C2A7'         => 'ç', 'C383E280A1'     => 'Ç', 'C383C2B1'       => 'ñ', 'C383E28098' => 'Ñ', 
      'C383C2A3'         => 'ã', 'C383C2B5'       => 'õ', 'C383C692'       => 'Ã', 'C383E280A2' => 'Õ'
    }

    keys.each { |table, columns|
      execute "alter table #{table} convert to character set utf8"
      columns.each { |column|
        conversions.each { |hex, utf8|
          execute("update #{table} set #{column} = replace(#{column}, unhex('#{hex}'), '#{utf8}') where #{column} regexp unhex('#{hex}');")
        }
      }
    }
  end

  def self.down
    # left as an exercise for the reader :)
  end
end

The # encoding comment at the beginning is important; don't leave it out, or Ruby 1.9.2 will complain.
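If a character that matters to you is missing from the conversions hash, its key can be derived rather than typed by hand. The sketch below rests on a reading of the table, not on anything stated in the original article: each key appears to be the hex of a character's UTF-8 bytes, misread as Windows-1252 (which is what MySQL calls latin1) and then encoded as UTF-8 a second time. The helper name double_encoded_hex is hypothetical, and the fallback for bytes that Windows-1252 leaves undefined is an assumption chosen because it reproduces entries such as 'C383C281' for 'Á'.

```ruby
# encoding: UTF-8

# Hex of the byte sequence you get when a character's UTF-8 bytes are
# misread as Windows-1252 and encoded to UTF-8 again.
def double_encoded_hex(char)
  char.encode("UTF-8").bytes.map { |byte|
    begin
      byte.chr.force_encoding("Windows-1252").encode("UTF-8")
    rescue Encoding::UndefinedConversionError
      # Assumption: bytes undefined in Windows-1252 (0x81, 0x8D, ...) map to
      # the code point with the same number.
      [byte].pack("U")
    end
  }.join.unpack("H*").first.upcase
end

puts double_encoded_hex("é")
```

Run it over whatever extra characters you need and paste the results into the hash as new key/value pairs.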

Use this to check you've covered all the relevant text columns:


mysql> use information_schema
mysql> select table_name, column_name from columns where table_schema = '__YOUR_DB_NAME__' and (data_type = 'varchar' or data_type = 'text');

(There might be other relevant data types, like 'mediumtext', that you have to deal with; don't just take my word for it.)

Here are some other places covering latin1/utf8 conversion:
http://stackoverflow.com/questions/4188677
http://groups.google.com/group/rails-i18n/browse_thread/thread/0146440dbf7d23ab
http://www.ruby-forum.com/topic/212452
http://stackoverflow.com/questions/4453842

Enjoy!