Open source science

This is an opinion piece from the American Mathematical Society about the use of proprietary software in mathematics. The same argument could easily be made for proprietary software in the sciences in general. Interestingly enough, one of the authors of the piece is also the man behind SAGE, which I am finally installing on my assigned computer at school today. It’s also why I have been using Python with scipy and matplotlib as a replacement for Matlab recently. It’s not that I believe I’m any smarter than the kind of people working at Wolfram or MathWorks (I’m not), but I like knowing that the algorithms I use are open to be reviewed by anyone with the skill and the time. Plus I’d prefer not to have all my work in a format I can’t use without pirating software or paying per-seat licensing.


The ‘Open XML formats’ seminar

I guess I should go into how I ended up talking about XML document formats and openness. On Monday the 13th I got into work (well, technically my contract is over and I’m finishing up some things I felt bad about leaving unfinished) and was called in by my (ex) boss, who told me that she had received an invitation from the Ghana Standards Board on the previous Friday to a seminar they were holding with Microsoft for all IT stakeholders in the country on ‘Open XML formats’. This was of course basically an attempt to lobby our standards board to vote for ISO certification of OOXML. The fact that they started out by misnaming what they were doing didn’t put me particularly at ease.

By the time I was briefed, the job had already sent a letter back to the Standards board outlining the fact that there were issues other countries were raising about OOXML and pointing out that a discussion on open XML formats would be better served if Microsoft wasn’t the only group making presentations. Since they were more diplomatic than I have a tendency to be, they did not also mention the perception of conflict of interest the Board was inviting by co-hosting the seminar. I was told to read up as much as I could for the seminar, which was to be the next day.

Tuesday. I showed up a little late to the seminar, which was being held at the African Regency hotel. It was attended by

  • softTRIBE (a local software company)
  • AITI-KACE (the job)
  • Ministry of Communications
  • Ghana ICT Directorate (Their chairman headed the seminar)
  • The University of Ghana
  • Atlantic Computers (Mostly a vendor)
  • K-NET (ISP)
  • IPMC (IT school and vendor)
  • 3 Microsoft representatives: Atilla, Chineye and Andreas Eberht (Microsoft Regional Technology Officer for the EU)

From the beginning it was obvious that the Standards Board had become aware of the possibility of appearing dodgy and so clarified its position in a speech read by its representative that they were not there to take sides but instead to listen to the stakeholders (the above companies + others who didn’t show up) and carry their vote to the ISO. This was followed by a speech from the GICTED boss talking about the objectives of his outfit. Then we got to the actual talks.

    The three representatives went one at a time, Chineye first, then Atilla, then Andreas. Chineye gave a fairly uninteresting talk about the benefits of OOXML to Ghana that managed to say almost nothing about OOXML while trying to flatter us. The CEO of SoftTribe spoke briefly about the financial reasons for supporting OOXML and actually unwittingly made a great point. Atilla talked a bit more about OOXML, mostly using all the arguments that led to the creation of ODF in the first place, but pretending those arguments came from Microsoft’s side. The amusement it gave me listening to him explain the evils of closed formats was worth the entire day.

    Andreas was the real big gun though. Apparently he makes a living lobbying the EU for MS, so he’s very good at framing the argument to suit his company. Purely from a sophistry point of view, he was impressive. His presentation was split into several parts, most of which again attempted to make arguments that work just as well for ODF, but with more skill and finesse. There was nothing in there you won’t find by looking at the Microsoft blogs on OOXML. Well, there was also a section on the advantages of XML-based formats that separate content from formatting. Obviously he didn’t mention that this applies not only to OOXML, but to any kind of XML-based document type, and also to LaTeX. Then we got lunch before the Q&A session.

    Sidenote: setting all the talks before a big lunch and then getting back a bunch of satisfied, semi-sleepy people to take questions from was a beautiful stroke, or maybe I’m just cynical.

    So, the Q&A session rolled around. I asked some questions and an attempt was made by the MS reps to paint me as ill-informed and obtaining all my information from blogs on the internet run by anti-Microsoft fundamentalists. Oh, and of course IBM was mentioned as the prime company lobbying everyone and providing them with groundless reasons to vote against OOXML. Then came the best tactic of the day: dismissing my questions as ‘too academic’ and ‘concerned with the needs of other nations, not Ghana’. After I stopped being annoyed at the attempt to shut me down, I was highly amused.

    Anyway, at the end of the thing no real conclusions were reached as most of the people there had not read the specification, so another meeting was scheduled for Monday the 27th. That will be my next post.

    Observations:

    • A lot of IT stakeholders in Ghana are very connected to Microsoft, it seems. I suspect I know why, but that is also a topic requiring its own post. It does taint the process though.
    • After the seminar there was a storm of calls and emails to the Standards Board from IBM and a bunch of other sources. I didn’t have anything to do with that, did I?
    • Microsoft’s people are slick, especially Andreas. Then again, I get the feeling that defending Microsoft in the EU takes a substantial amount of sophistry. And that I can honestly respect.
    • To be fair to the Standards Board, they have made every effort to be neutral and keep the discussion purely on the merits of the proposal at hand. The problems with the process aren’t really their fault; the nature of our IT industry makes it easy to subvert.

    More on this later


    A thousand pardons for my lateness

    I was supposed to have up a report on the OOXML seminar that happened last week. Sadly I have been spending all my time preparing for the follow up meeting which happens in an hour or so.

    This one will be more interesting. We have representatives from Microsoft, IBM and apparently a professor at the University of The Western Cape in South Africa. All the relevant stakeholders will also be there to listen and possibly vote. We are however going to require full disclosure of any affiliations they have with Microsoft before the meeting.

    Just so you know, the way it works is the standards board assembles a list of relevant stakeholders, sends them the specification and then lets them vote on it. I’ll have the list for you after the meeting. I should also point out that the standards board is doing their best to appear neutral because of a ton of pressure from Microsoft and its partners on one side, and IBM, civil society groups and apparently foreign standards boards on the other. It has also been strongly implied to me that they do not want to be seen as an impediment to a rumoured deal between Microsoft and the Government of Ghana. That can be bad for careers.

    Also worth noting, some of the Microsoft people who came here also went to Ethiopia to make the same pitch. This is definitely part of a concerted effort to lobby African governments. And to be perfectly honest it annoys the hell out of me.

    I’ll be there to speak and take notes on the event, so you’ll get both reports later this week. Maybe tonight if I have the energy to write them up.


    Background information on ODF, OOXML and why it matters

    In the beginning there was MS Office:

    Or at least there has been for a substantial amount of time now. Office has for a long time been the standard office suite almost everywhere, a fact that has made Microsoft a fairly massive amount of money. To ensure that they kept making money off Office, they adopted an unfortunately common practise known as vendor lock-in.

    What that means is that the information you type into a .doc file is only available to you if you have a copy of MS Office around that opens it. No one else is given the information on how to open those files. Hence,

    • Once you have created a bunch of .doc files, you are restricted to using Word to open them.
    • If someone else sends you information in a .doc file, you need Word to be able to read/edit that information.
    • New versions of Office make subtle changes to how they save those .doc files so you are forced to upgrade if you want to read a file coming from someone with a newer version.
    • This is regardless of whether or not you actually need any feature the new version offers.

    The overall effect is to ensure that users are locked into a specific vendor’s tools and can’t switch to another vendor’s tools, which may be better or cheaper, without losing all of the information they currently have stored under the file format of the old tool. Even worse, in order to share documents, they will continuously be paying for the latest version of the tool.

    Now of course, once the maker of said tool controls a large enough chunk of the market, they have an assured revenue stream from simply releasing new versions of their software, and the consumer is screwed by the masses of old documents they have and their need to communicate with everyone else who uses this same tool and can’t abandon it either.

    Enter Sun Staroffice/Openoffice.org and ODF:

    In the late 90’s, Sun Microsystems bought an office suite called Staroffice. They decided to make the source code of Staroffice open and in doing so created Openoffice.org, a free, cross-platform office suite.

    Somewhere along the line it occurred to someone that there did not exist an open file format and the creation of an international standard for document types would pry the market open and allow several competitors to sell their office suites based solely on individual merit in place of vendor lock-in.

    Openoffice’s file format thus became the basis for a file format approved by the International Organization for Standardization (ISO), known as OpenDocument Format (ODF), which is currently controlled by an independent body called OASIS. OASIS is an international standards body that controls several different standardized file formats. Microsoft is listed as a member on their website, though I was told by one of their employees that they recently left the group.

    ODF is an open standard that is supported in a number of different programs covering all major platforms. Anyone else who wants to implement it has access not only to the original specification but also to the source code of several of its current implementations. It’s also a standard under rapid development to add in features that people feel are necessary in a document format. As a result there is a substantially reduced risk of vendor lock-in with ODF. Your documents will travel with you between office suites which support ODF. And any office suite can add support since it is an open standard.

    At this very moment, ODF is supported by Openoffice.org, Staroffice (free with Google Pack), Koffice, Neooffice, Abiword, Gnumeric, Google Docs and Zoho Office among others. Therefore a document written in any of these programs can be read, written and edited in any other of these programs.

    A number of government agencies and businesses started to realize that switching to ODF would free them from vendor lock-in. A couple of them started to do just that and a lot of others are seriously looking at only recognizing open standards in the creation and processing of official documents.

    OOXML joins the party:

    Obviously the loss of several billion dollars of revenue, maintained by the inability of their customers to open their documents anywhere else, was not something Microsoft was in any particular hurry to see slip away. Hence, they took the document format of Office 2007 and submitted it to an international standards body called ECMA as Office Open XML (OOXML). All similarities in name to Openoffice.org are a mistake I’m sure *cough*. At the moment, the only full implementation of OOXML I know of is Office 2007. There are partial implementations out there, but nothing that is close to the level of Office’s support.

    Now, the initial ODF specification was 700 pages long. It took 3 years to get full OASIS approval and another year after that to get ISO approval. OOXML’s specification is a bit over 6000 pages. It passed ECMA in just about a year and is trying to get fast-tracked through the ISO, reducing the amount of time member countries have to examine the entire thing and come up with objections.

    Even with the limited time though, a lot of objections have been raised by people, companies and countries about parts of the specification being vaguely written and depending on technology that Microsoft owns patents on and has not waived its right to sue over. The fear is that turning OOXML into an open standard will merely be a Trojan horse to allow them their continued stranglehold on the market by means of an ‘open’ format which only they can fully implement.

    In response they have apparently been sending PR teams around to national standards boards all over the world (Ghana for a fact) to lobby for votes for OOXML under the guise of talking about ‘Open XML Standards’. On the other side there has been an effort spearheaded by IBM to make those same boards, some of which do not necessarily have the expertise to review 6000 pages of dense XML specifications in a month, aware of the existence of ODF and the technical objections that have been raised to OOXML.

    Why this matters in Ghana and the developing world:

    I keep getting asked this question a lot recently, so I’ll take a stab at ignoring the myopia it implies and answer it as best I can.

    Developing countries are still building the vast majority of their IT infrastructure. This means that they do not have a massive base of old documents in a restricted format. Those documents are on paper. Their offices are still being computerized. Their people are still learning how to use those computers. If you are going to teach someone to use an office suite anyway, what difference does it make if that suite is MS Office, Openoffice.org or Google Writer? What difference does it make if those legacy paper documents go to ODF or OOXML? Either way the work has to be done and the money has to be spent.

    The problem is, what happens when you lock yourself into a company’s proprietary format because they are giving you free stuff and claim the format is open, then they start charging you for it and you realize all those alternatives they assured you existed can’t fully open your documents and you are stuck with them and their licence fees?

    MS is spending a lot of money in Africa and giving a lot of stuff away for free. That altruism won’t last. It can’t, it’s too expensive. If OOXML is truly open (and what I’m seeing has me doubtful of that) then it doesn’t matter. When they start charging we can just evaluate our options and go in the direction that makes the most sense for us. If it isn’t, we’ve spent a lot of money to build a foundation that renders us slaves to one company’s whims, and unlike richer parts of the world, we can’t come up with the money to change direction.

    If OOXML is inappropriately tied to Microsoft tools and software, it doesn’t fit the definition of an open standard and making it one is inviting trouble we don’t want and probably can’t recover from quickly.


    This should be fun

    Apparently tomorrow the Ghana Standards Board is jointly hosting a seminar with Microsoft about their OOXML document standard which the Standards Board will be voting on as an ISO spec.

    I’m going to leave alone the whole issue of how it looks for the standards board, which has a vote on the issue, to be hosting an MS sponsored event because I get to go. And I get to ask the MS presenter questions. And this being Ghana, I’m willing to bet money they do not expect an informed audience.

    *slightly feral grin, hears ‘Jaws’ theme playing faintly in the background*

    This should be FUN!


    Since people keep asking, the questions

    These are the questions we handed out for the most recent I2CAP competition.

    Note: These kids have had maybe 2-3 months of practice with teachers who got a 3 day intensive training course. Just to put things in perspective. Points were also withheld for things like a lack of type checking. The winning teams all had running solutions for about 2-3 of the problems. One of them had solutions for all 4. They all had pretty much figured out the problems. The non-running solutions were due to tiny bugs and not flaws in logic. At the bottom end of the spectrum, some losing teams barely made it through one problem.

    Question 1.

     

    As a new ruby programmer, your friend who has no ruby programming knowledge has approached you to help him with a program. This program should accept two values as input through the keyboard, and as output, produce the product of the two numbers (e.g. 6, 9 becomes 54 and 12,-20 becomes -240). The program should output an error message if a string or zero(0) is entered.

    5 marks

    Program Name: Product

    Example

    Input: 45, 10

    Output: The product of the two numbers is 450
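
For anyone curious, here is one way Question 1 could be approached. This is only a sketch, not the marking scheme: the method names `parse_nonzero_integer` and `product_message` and the wording of the error message are my own inventions, and the inputs are passed in as strings so the logic can be tried without a keyboard.

```ruby
# Parse one piece of keyboard input as a non-zero integer.
# Returns nil when the text is not a whole number or is zero.
def parse_nonzero_integer(text)
  number = Integer(text.to_s.strip, 10)
  number.zero? ? nil : number
rescue ArgumentError
  nil
end

# Builds the line the Product program would print for two raw inputs.
# The error wording is an assumption; the question does not specify it.
def product_message(first_text, second_text)
  a = parse_nonzero_integer(first_text)
  b = parse_nonzero_integer(second_text)
  return "Error: please enter two non-zero whole numbers" if a.nil? || b.nil?

  "The product of the two numbers is #{a * b}"
end

puts product_message("45", "10")
puts product_message("12", "-20")
puts product_message("six", "9")
```

In a real entry the two strings would come from something like `gets.chomp` rather than being hard-coded.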

    Question 2

     

    Write a ruby program that accepts integer X from the keyboard and use it to create an inverted triangle with X levels

     

    Example:

     

    Input: 5 Output:

    *****

    ****

    ***

    **

    *
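
A possible sketch for Question 2, assuming X has already been read and converted to an integer. The method name `inverted_triangle` is hypothetical; it returns the rows as an array so they can be checked before printing.

```ruby
# Returns the rows of an inverted triangle, widest row first.
def inverted_triangle(levels)
  levels.downto(1).map { |width| "*" * width }
end

# puts prints each element of the array on its own line.
puts inverted_triangle(5)
```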

     

    Question 3

     

    In the early days of Roman numerals, the Romans didn’t bother with any of this new-fangled subtraction “IX”. It was straight addition, biggest to smallest, so 9 was written “VIIII”, and so on. Write a method that, when passed an integer between 1 and 5000 (or so), returns a string containing the proper old-school Roman numeral. In other words, old roman numeral 4 should return “IIII”. Make sure to test your method on a bunch of different numbers. Hint: Use the integer division and modulus methods. 15 Marks

    For reference, these are the values of the letters used:

    I=1 V=5 X=10 L=50

    C=100 D=500 M=1000

     


    Program Name: Roman_Numerals

    Example 1

    Inputs:

    integer : 75

    Output: The roman numeral for 75 is LXXV
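
Following the hint about integer division and modulus, Question 3 might be sketched like this. The constant `OLD_ROMAN_VALUES` and the method name `old_roman_numeral` are my own choices, and the sketch assumes the caller has already checked that the input is a positive integer.

```ruby
# Letter values from the question, largest first.
OLD_ROMAN_VALUES = [
  [1000, "M"], [500, "D"], [100, "C"], [50, "L"],
  [10, "X"], [5, "V"], [1, "I"]
].freeze

# Builds an additive-only ("old-school") Roman numeral.
def old_roman_numeral(number)
  numeral = ""
  remaining = number
  OLD_ROMAN_VALUES.each do |value, letter|
    count = remaining / value      # how many times this letter fits
    remaining = remaining % value  # what is left for the smaller letters
    numeral += letter * count
  end
  numeral
end

puts "The roman numeral for 75 is #{old_roman_numeral(75)}"
```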

     

    Question 4

    As a new ruby programmer, write a ruby program that uses loops to convert from Ghana cedis (GH¢) to cedis (¢) and vice versa. The program should print out a menu with the following options:

     

    1. Ghana cedis (GH¢) to cedis (¢)

    2. Cedis(¢) to Ghana cedis (GH¢)

     

    Based on the option selected by the user, the program should accept a number from the user and perform the appropriate conversion.

     

     

    Example 1:

     

    Menu:

    Welcome to my Ghana cedis conversion program.

    Select an option from the menu below.

     

    1. Ghana Cedis to Cedis

    2. Cedis to Ghana Cedis

     

    Input:

    Select: 2

    Enter amount in Cedis: 200000

     

    Output:

    Amount in Ghana Cedis is : 20.00

     

     

     

    Example 2:

     

    Input:

    Select: 1

    Enter amount in Ghana Cedis: 450

     

    Output:

    Amount in Cedis is : 4500000
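
A rough sketch of how Question 4 could be structured, assuming the rate of 10,000 cedis to 1 Ghana cedi that Example 2 implies (450 becoming 4500000); at that rate 200000 cedis comes to 20.00. The names `convert` and `run_menu`, the prompt and error wording, and the decision to stop on a blank line are all my own assumptions. The input is scripted through a `StringIO` here so the loop can be run without typing anything.

```ruby
require "stringio" # used only for the scripted demo at the bottom

CEDIS_PER_GHANA_CEDI = 10_000 # rate implied by Example 2

MENU = <<~TEXT
  Welcome to my Ghana cedis conversion program.
  Select an option from the menu below.
  1. Ghana Cedis to Cedis
  2. Cedis to Ghana Cedis
TEXT

# Returns the output line for one menu option and amount.
def convert(option, amount)
  case option
  when 1 then format("Amount in Cedis is : %d", amount * CEDIS_PER_GHANA_CEDI)
  when 2 then format("Amount in Ghana Cedis is : %.2f", amount.to_f / CEDIS_PER_GHANA_CEDI)
  else "Error: unknown option"
  end
end

# Shows the menu repeatedly until the input runs out or a blank line is entered.
def run_menu(input, output)
  loop do
    output.puts MENU
    output.print "Select: "
    choice = input.gets.to_s.strip
    break if choice.empty?

    output.print "Enter amount: "
    amount = Integer(input.gets.to_s.strip, 10, exception: false)
    output.puts(amount ? convert(choice.to_i, amount) : "Error: amount must be a whole number")
  end
end

# Scripted run: option 2 with 200000, then option 1 with 450.
demo_output = StringIO.new
run_menu(StringIO.new("2\n200000\n1\n450\n"), demo_output)
puts demo_output.string
```

Passing `$stdin` and `$stdout` to `run_menu` instead of the `StringIO` objects would make it interactive.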

     


    Other thoughts/Questions

    1. I need a good Ruby IDE for both Windows and Linux. Right now we use Freeride, which seems to be in neither the Debian nor the Ubuntu repositories. Plus it’s a bit on the clumsy side and seems to crash quite a bit, although I suspect we are using an older version. Still, suggestions are welcome.
    2. A good cheap Ruby book would also be useful. My suggestion has been that we write one covering the basics of the language with a ton of examples and problems for the Institute to freely distribute.
    3. The difference in power draw between CRT and LCD screens needs to be seriously considered when purchasing time comes around, a point I attempted to make about the Intel Iadvance systems when we first saw them (more on this later). While CRTs are cheaper, the difference in power draw should matter a lot in a country which is beginning to subsidize power-saving light bulbs and has huge energy issues right now.