IMDB text classification: Keras vs TensorFlow 1.x vs TensorFlow 2.0 vs PyTorch

When learning to code deep learning problems, you come across quite a few (Python) frameworks. Personally, I started with Keras, it being the easiest, but then I got curious how the others work.

So, as an exercise, I implemented the same problem in several frameworks, trying to make each version work exactly the same (that is, take the same input, give the same output and implement the same kind of neural network inside).

The starting point for me with Deep Learning in general was a rather good introductory book, “Deep Learning with Python” by Keras author François Chollet. My Keras code therefore follows the instructions from the book, while the subsequent implementations in other frameworks I did myself, trying to mimic what was already done in Keras, as a small exercise in getting to know different APIs 🙂

I used the IMDB review classification problem (deciding whether a review is positive or negative given its text), implementing a simple MLP network with 2 hidden layers.
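To make the "same network in every framework" idea concrete, here is a minimal framework-free sketch of the forward pass of an MLP with 2 hidden layers. It is not the code from the notebooks: the multi-hot input encoding, the layer sizes (`VOCAB_SIZE`, `HIDDEN`), the ReLU/sigmoid activations and the random initialisation are all assumptions chosen for illustration.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp_forward(x, layers):
    """Run x through two ReLU hidden layers and a single sigmoid output unit."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = dense(x, w1, b1, relu)
    h2 = dense(h1, w2, b2, relu)
    return dense(h2, w3, b3, sigmoid)[0]


rng = random.Random(0)
VOCAB_SIZE, HIDDEN = 20, 16  # assumed toy sizes, not taken from the notebooks
layers = [
    init_layer(VOCAB_SIZE, HIDDEN, rng),
    init_layer(HIDDEN, HIDDEN, rng),
    init_layer(HIDDEN, 1, rng),
]

# Hypothetical review, multi-hot encoded: 1.0 where a vocabulary word occurs.
review = [1.0 if i in {2, 5, 11} else 0.0 for i in range(VOCAB_SIZE)]
p_positive = mlp_forward(review, layers)
print(f"untrained probability of 'positive': {p_positive:.3f}")
```

Each framework expresses these same three `dense` steps with its own layer objects and adds training on top; the weights here are untrained, so the printed probability carries no meaning beyond showing the data flow.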

I tested the following frameworks, implementing the same IMDB problem in all of them: Keras, TensorFlow 1.x, TensorFlow 2.0 and PyTorch.

All source code is available in Jupyter notebooks on GitHub; I ran it using Python 3.7.

My general personal impression is that Keras is by far the best for quick prototyping and for running many experiments that you might need to revisit later (i.e. the code is simple and readable). At work, I often need to test many cases/architectures/configurations for a client, jump to another project, and maybe a few weeks later come back to my old experiments to continue or to pick something for the final deliverable. So readable code and being able to quickly recall what I actually did is nice.

Of course, I looked at other frameworks for a reason. Keras is high-level, so when I need more flexibility and control to implement something custom, I might need to go elsewhere. The now almost legacy TensorFlow 1.x is truly a pain to debug and very unintuitive in comparison to any other Python framework/library. PyTorch has a more interesting approach, but I found it not as well documented and, after learning both Keras and TensorFlow 1.x, actually quite annoying :p Finally, TensorFlow 2.x is basically trying to steal the best tricks from everybody: it fully integrated Keras, and then for more control it adopted a PyTorch-like approach of defining the model through subclassing etc. At the moment of writing, TF2 is still in beta / release candidate, so I would say it needs time to mature (e.g. I’m getting lots of warnings when running even the basic TF2 subclassing example under Windows, and the documentation, even though improved, I still find lacking). Overall though, I think it’s a good direction.

PS. My opinion could be a bit biased towards Keras, because by the time I started with TensorFlow/PyTorch I had already gone through the entire Chollet book implementing everything, and I had used Keras for some problems at work with clients too.

Daily wallpaper tool for MacOS

For quite a while now, Windows 8 and 10 have had this neat feature of setting the Bing Daily Wallpaper as the lock screen image or as a wallpaper (via the Bing Desktop app). Whatever the quality of the Bing search engine, I think the selection of those daily images is really well curated.

In the past, I used a bunch of shell scripts to get similar functionality in MacOS, and there are also some paid apps available in the App Store. However, I wasn’t fully happy with either of those solutions so… I wrote my own Cocoa MacOS app to do the job!

Wallpaper Switcher works as a Preference Pane that integrates into the MacOS System Preferences. You can download it here and simply open it to install on your machine.

My intention was to make an easy-to-use, single-click UI without the need to mess around with shell scripts / the command line each time I reinstall or update MacOS. The app makes the entire process seamless, and afterwards it just disappears into the background as if it were part of the OS.

Wallpaper Switcher uses the native MacOS scheduler, so it does not take any additional system resources, saving system memory and battery time if installed on a laptop. I tested it primarily with MacOS Mojave (10.14).

As a bonus, apart from Bing Daily Images, Wallpaper Switcher lets you use the National Geographic Photo of the Day, images from Reddit posts (like /r/wallpapers or /r/art, which have a daily stream of community-voted images), or just your own custom URL.

I’ve written the entire app using Objective-C and Apple’s Cocoa framework. Aside from the Preference Pane, the app has a small embedded command-line tool that does all the downloading and wallpaper setting in the background. To save resources, the daily updates are run by the built-in MacOS scheduler, launchd. All of this is set up and managed by the app based on the user preferences set in the Preference Pane.

If anyone is interested in how all this works under the hood, I published the entire source code and the latest binaries on GitHub.
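For readers unfamiliar with launchd, the scheduling part can be sketched as a job definition (a property list) that tells launchd which program to run and when. The snippet below is only an illustration in Python, not the app's Objective-C code: the job label, the tool path, its arguments and the 09:00 schedule are hypothetical, while `Label`, `ProgramArguments` and `StartCalendarInterval` are standard launchd keys.

```python
import plistlib


def build_launchd_job(label, program_args, hour, minute):
    """Return a launchd job definition that runs a program once a day."""
    return {
        "Label": label,                          # unique job identifier
        "ProgramArguments": list(program_args),  # executable followed by its arguments
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }


job = build_launchd_job(
    label="com.example.wallpaper-switcher",  # hypothetical label
    program_args=["/usr/local/bin/wallpaper-tool", "--source", "bing"],  # hypothetical tool and flags
    hour=9,
    minute=0,
)

# Serialise to the XML property list format that launchd reads.
plist_bytes = plistlib.dumps(job)
print(plist_bytes.decode("utf-8"))
```

Saved as a `.plist` file under `~/Library/LaunchAgents/` and loaded with `launchctl`, a definition like this would have the system start the tool at the given time, so nothing has to stay running in between.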

Conference spotlight: ICT 2010

I just got back from an EU event called ICT 2010 (held 27-29 September in Brussels). It’s run every two years, and in fact it is more of a business event than a scientific conference.

The goal is to gather people connected to ICT innovation and research in Europe through participation in European Commission funding programs.

Throughout the first two days of the conference, which were the ones I attended, there were a number of general talks connected to fostering innovation in Europe (e.g. a quite interesting keynote and panel on “Driving societal change, opportunities for all”), but also focused panels and so-called Networking Sessions, which were run in smaller interest groups on particular topics (e.g. I attended the Open Innovation session, which unfortunately didn’t prove that interesting for me). Apart from that, a large part of the venue was taken up by exhibition stands presenting the projects and research done with EU funding. The interesting part of this was the diversity: the projects showed that ICT research (with some cool outcomes) is being funded in a great many areas.

Nevertheless, it’s worth noting that the key value of this event is not the talks but the opportunity to make new contacts for future proposals and business. It was very evident that all the participants came to get as much value as possible for future funding opportunities. Pretty much everybody (over 1k participants) was constantly hunting for new contacts that could later develop into consortia for EU proposals (EUR 2.8 billion for ICT R&D in 2011-2012).

In conclusion, ICT 2010 was quite interesting and indeed a big event. It is definitely not to be missed by people interested in European research or in collaboration with European researchers. From my personal perspective as a PhD student and a university researcher, this was a nice opportunity to see what others do, and I made some new contacts and collaborations for both academic research and writing/participating in proposals. Oh, and an additional bonus: for students the registration is free, making this a truly great opportunity!

Idea Management Systems research – interesting links and articles compilation

Relating to my Idea Management research, I have put together a compilation of interesting articles and websites. Bear in mind that this is primarily done from the point of view of a researcher interested in the data structures and mechanics of Idea Management Systems. The content below is far from complete and the choice is fully subjective. However, it should make an interesting read as an introduction to selected topics. To make browsing easier, I grouped everything into categories and sorted the links by my personal preference.

Do you know any other interesting resources? Let me know in the comments!

note: please be advised that in a number of sections the criteria by which I list or order publications are subject to my research interest, which is: data, data and once again data in Idea Management (and also to my personal likes or dislikes of certain publications).

note2: BibTeX info to come soon ™ 🙂

1. Idea Management Systems – vendor lists

2. Idea Management Systems & Innovation Management – research related to Semantic Web and Ontologies
2.1. Active projects:

  • GI2MO – shamelessly I link my project first ;). The goal of the project is to use Semantic Web technologies to solve a number of problems of Idea Management. The core and entry point is a proposal of an ontology for Idea Management Systems.
  • Idea Ontology – another interesting project with a similar goal to GI2MO. It takes a different approach, but is nonetheless equally interesting.

2.2. Selected Publications:

  • “A Model for Integration and Interlinking of Idea Management Systems”, Westerski et al. [bibtex]
  • “An Idea Ontology for Innovation Management”, Riedl et al. [bibtex]
  • “Semantic innovation management across the extended enterprise”, Ning et al. [bibtex]
  • “Innovation and Ontologies”, Bullinger [bibtex]

3. Idea Management & Innovation Management – research or publications unrelated to Semantic Web
3.1. Idea Management Systems

  • “The A to Z of Idea Management”, Schwartz [bibtex] – a book that attempts to describe in a systematic way how Idea Management Systems work
  • “Improving idea generation and idea management in order to better manage the fuzzy front end of innovation”, Glassman [bibtex] – a PhD thesis focused on the first steps of the idea life cycle
  • “Idea Management for organisational innovation”, Flynn et al. [bibtex]

3.2. Innovation Management (additional reading)

  • “Developing a software infrastructure to support systemic innovation through effective management”, Dooley et al. [bibtex]
  • “Innovation Metrics and Measurement in 2009”, xxx et al. – an interesting publication about which innovation metrics are used by companies
  • “Information technology support for the knowledge and social processes of innovation management”, Adamides et al.
  • ….

note: the literature and resources on the particular topic of Innovation Management are very rich, yet they often stray very far from the specific area of Idea Management. Therefore, I listed only a few items that are at least loosely connected, either because the study is related to Computer Science or because it has some influence on Idea Management problems (e.g. the innovation metrics paper).

4. Idea Management Systems – interesting news websites, blogs etc.

  • “InnovationManagement.se”, an online magazine, regularly updated with lots of interesting news posts. However, its scope is much broader than just Idea Management.
  • “Idea Management Systems”, a blog run by Lauchlan Mackinnon. Focused on Idea Management, but it also posts some short articles on topics loosely connected to innovation.
  • “Innovation Tools”, a portal with some info about Idea Management, including news and blog posts. Be advised though: in my personal opinion, the objectivity of some posts here is very doubtful, and some look as if they were sponsored by certain vendors (e.g. to support my claim, take a look at one of their recommended resources: “A Short History of Idea Management”)

5. Idea Management Systems – company blogs, newsletters etc.

note: be advised that the goal of Idea Management vendors is to make a profit, so treat the objectivity of all this information with care and keep in mind that what you read might not always be the full picture.

  • Spigit company blog, fairly often updated with information that is not only directly related to business but also covers innovation management theory in general.
  • Imaginatik company blog, contains both reports on company activity (e.g. case studies) and some articles on innovation with regard to its connection to Idea Management.
  • Mark Turrell blog, the blog of the CEO of Imaginatik. Not really that active anymore, but it contains a number of interesting posts on the topic of Idea Management.
  • Newsletters, contrary to what you might think, it’s not always only marketing and advertisement. Companies like BrightIdea or Accept sometimes run online presentations where they share their experience or invite people to talk about innovation topics.

DEA marks two years spent in Spain (almost)!

This week I took my exam for the ‘Diploma de Estudios Avanzados’, DEA for short. This marks roughly two years spent in Spain doing my PhD. The “exam” is fairly simple: basically just a presentation about the research I did during that time and the courses I took. My presentation was mainly about the Semantic Web research I’ve done and the areas I apply it to. You can check out the slides here.

I was talking about:

  • research I did on mashups during my first year here and the courses I took concurrently at the university (all this was a rather brief part of the talk)
  • my current research on applying Semantic Web to Idea Management and my project (GI2MO) that will hopefully be the main pillar of my PhD (this was the main part of the presentation)

After being awarded the DEA, the road to the PhD diploma is open, and in theory, if I had done enough research/publications, I could get it as soon as the day after (which obviously won’t happen ;p).

Hello (research) world!

Yep, it’s a fact. After quite some years of engagement in various research projects, I’ve finally decided to run a research blog and publish information about various things I do or think about in relation to my scientific work.

Time will tell how successful I’ll be with this, but I plan to update this space regularly and publish the progress/news of my PhD work. Most likely I’ll also put some ideas regarding my work here, since lately I’m getting a lot of those and I figured it’s part of my work to share them too!

Welcome and enjoy.