Experiments with "other than work" stuff

The domain I currently work in, web development, isn't one of my favorites. I mean, just rendering HTML pages and saving and retrieving values from a database gets mundane after a while. There are things you can learn, but I guess there is a limit to it, at least on the backend side. The frontend side is one of the evergreen fields of webdev. If you are a frontend developer, companies will line up for you. Not so much for backend developers. And the most important thing about your domain, in my opinion, is that you should be fascinated by it. Every day before going to work, you should be excited about the stuff you are going to tackle. This, which is something I value a lot, isn't really happening for me. I am not saying I am a top web developer, just that it does not fascinate me. I am sure many people can sympathize with me on this one. So what does fascinate me? What am I crazy about?

The short and sweet answer is ML and analytics. I just look at the number of ways it could be applied at my workplace, and I think, damn, why has no one thought about this yet!? In my opinion, the application I currently work on holds data that could be a goldmine, and I am not sure people are aware of it. Just imagine if you could somehow predict how many tickets are going to be filed for a particular issue. You could then pre-allocate your resources to handle such loads. Imagine if you could suggest solutions to an issue as it is being reported! Such insight can be invaluable in situations where turnaround time is critical. All this and more can be achieved with machine learning techniques like classification, prediction, natural language processing, and others. Once you get a taste for it, it is all you can see. I keep thinking about how prediction could be applied to scenarios we face daily. The possibilities are absolutely endless, as every video and article on machine learning seems to claim. Already we see recommendation systems in place everywhere. This is the sort of work I want to be doing.

Not that I know ML very well. I am just learning, and the taste is already intoxicating. I recently did something along these lines for a hackathon project at work. Most of you would be aware of Jira tickets. At work, we have something similar called RPD. We built an RPD Summarizer, which summarizes the comments in an RPD and presents you with a short and simple summary so that you don’t need to read through tens of comments, most of which are useless. This was something novel for me and my team, and I wanted to do it at any cost. I could not pass on the chance to do something really interesting on company-paid time :P (Just kidding!) So I looked up how summarization is done, how conversations are summarized, and how articles are shortened, among other techniques. I was amazed by the number of ways there are to do such stuff. Even more impressive is the amount of research that has been done on this topic, which shows how important a problem it is in Computer Science.

Here is the most magical part: all these problems can be mapped to basic computer science problems. For example, summarization can be mapped onto a simple graph problem, the shortest path from start to end. I was skeptical at first. I expressed my skepticism to a colleague who is an ML engineer, and these problems are meat and drink for him. He explained how it all made sense: the first comment is the start node of this imaginary "graph". Similarly, the last comment is the end node. The comments share something in common: context. They are all, most likely, talking about the same thing. So each comment becomes a node in the graph, with connections to the other comments. What are the weights of the connections? Interesting question. If a comment is referenced many times, then it is probably important, right? Its content deserves a place in the summary. And if two comments are similar, would you want both of them in the summary? Definitely not. Using these and other measures, we build this "graph". Now what is the summary? The shortest path across this graph! So intuitive, so logical, so marvelous. When I realized this, I was speechless for a while. That an NLP problem can be mapped to a graph problem, and solved using graph techniques, really gives the satisfaction of applying all the concepts we learned in college.
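To make the idea concrete, here is a minimal sketch in Python of how I picture it. This is not our hackathon code. The edge cost formula, the word-overlap similarity, the `skip_penalty` value, the `ref_counts` input, and the sample comments are all assumptions I made up for illustration. Each comment is a node, edges only go forward in time, and Dijkstra's algorithm finds the cheapest path from the first comment to the last.

```python
import heapq
import re


def tokens(text):
    """Lowercased word set of a comment (a crude stand-in for real NLP features)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(comments, ref_counts, skip_penalty=0.5):
    """Edges only go forward in time: i -> j for i < j.

    Hypothetical edge cost (an assumption, not the real formula):
      + similarity(i, j)         discourages repeating near-duplicate comments
      + 1 / (1 + ref_counts[j])  often-referenced comments are cheap to visit
      + skip_penalty for every comment skipped between i and j
    """
    bags = [tokens(c) for c in comments]
    graph = {i: [] for i in range(len(comments))}
    for i in range(len(comments)):
        for j in range(i + 1, len(comments)):
            cost = (jaccard(bags[i], bags[j])
                    + 1.0 / (1 + ref_counts[j])
                    + skip_penalty * (j - i - 1))
            graph[i].append((j, cost))
    return graph


def shortest_path(graph, start, end):
    """Dijkstra's algorithm; returns the node indices from start to end."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == end:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in graph[node]:
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def summarize(comments, ref_counts):
    """The summary is the list of comments on the cheapest first-to-last path."""
    graph = build_graph(comments, ref_counts)
    return [comments[i] for i in shortest_path(graph, 0, len(comments) - 1)]


# Made-up ticket thread; ref_counts says how often each comment is referenced.
comments = [
    "Login page crashes after the latest deploy",
    "Same here, login page crashes after deploy",
    "Root cause is an expired session token in the cache",
    "Thanks everyone",
    "Fixed by clearing the cache, closing ticket",
]
ref_counts = [0, 0, 4, 0, 0]
summary = summarize(comments, ref_counts)
print("\n".join(summary))
```

With these made-up numbers the path goes through the first comment, the heavily referenced root-cause comment, and the last one, skipping the near-duplicate and the "thanks" comment. The skip penalty is what keeps the path from jumping straight from the first comment to the last. Without it, the direct edge would always win.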

This is just the beginning. I am going to continue the hackathon project, and see where it leads me. I was impressed with some of the summaries that were generated, but some were absolutely bogus. I need to figure out a way around such pitfalls. This is going to take up my weekends for sure. And I'll put my code on GitHub, so that I don’t lose track of it. I am actually very excited to do this, so excited that I haven't touched the Akbar novel for the past 5 days!

Ambarish's musings
