Yesterday we unintentionally inundated many of your phones with notifications. We messed up and owe you an explanation of what happened and what we are doing to ensure it doesn’t happen again.
On Thursday, May 19, at around 2:00 pm, the notifications team updated our fleet of notification servers with a code change. It took about 3 minutes for the deployment systems to update all our notification servers with the change.
Within minutes of the code-push, our users, including several employees, started reporting that they were getting bombarded with notifications unrelated to their interactions with Myntra. The team immediately stopped our notification systems and started to troubleshoot. However, within that short period, a lot of notifications had already been sent. We did cancel many of the notifications that were en route, but unfortunately by then a lot of our customers had already received them.
Why it happened
After shutting down the notification sending systems, the team worked on reviewing why the error had occurred. We realized that the problem was not with the new code base - which had been tested independently - but with how it was deployed.
Notification systems require a set of “transformations” to a message before it is sent - for example, adding the recipient’s name inside the message, adding the list of users to whom the message should be sent, etc. Each of these transformations is done by a set of processes which do their part and then put the message back in a queue for the next set of processes to take up. The new code had a “schema change” - the list of recipients was now expected to be in a new field called “userId” rather than “recipients”.
When we deployed our new code, there was a short period (2 min 37 sec) when the new code was active while notifications created by the older code were still being processed. This led to a “race condition” - the old code had already added the recipient in the old field (“recipients”) while the new code was expecting it in the new field (“userId”) and upon not finding a userId, left it blank. Defensive code should have been written for this case, but was missed.
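A minimal sketch of the kind of defensive code described above, written in TypeScript. It assumes a queued message that may carry the recipient list in either the old field (“recipients”) or the new one (“userId”). The `QueuedNotification` type and the `resolveRecipients` function are hypothetical names for illustration and are not taken from the actual notification system.

```typescript
// Hypothetical message shape. Old producers filled `recipients`;
// new producers fill `userId`. Either field may be absent in flight.
interface QueuedNotification {
  body: string;
  recipients?: string[]; // written by the old code
  userId?: string[]; // expected by the new code
}

// Read the new field first, fall back to the old one, and fail loudly
// instead of passing on an empty (untargeted) recipient list.
function resolveRecipients(msg: QueuedNotification): string[] {
  const fromNew = msg.userId ?? [];
  const fromOld = msg.recipients ?? [];
  const ids = fromNew.length > 0 ? fromNew : fromOld;
  if (ids.length === 0) {
    throw new Error("notification has no recipients; refusing to send");
  }
  return ids;
}
```

With a fallback like this, a message created by the old code would still be targeted correctly during the deployment window, and a message with neither field would stop the pipeline rather than being treated as a broadcast.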
Notification messages that went through the system during this intermediate state became untargeted notifications (i.e. they did not have any userIds). This resulted in them being broadcast to a very large set of users.
What we have learnt
There were several causes of the problem. The first was a shortcoming in the deployment model of our notification systems: all incoming notifications should be stopped and older notifications flushed out before a new codebase is deployed. We should also have had more stringent checks in the code: notifications missing a recipient list should be held back and subjected to more checks, and the number of notifications that any user can get in a short time period should be more stringently capped. Over the last 24 hours, we have already added some of these checks and will be adding more.
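The two code-level checks mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the checks that were actually added: the `PerUserRateCap` class, the `screenNotification` function, and any limits passed to them are hypothetical.

```typescript
// Sliding-window cap on notifications per user (limits are made up).
class PerUserRateCap {
  private readonly sentAt: Record<string, number[] | undefined> = {};
  private readonly maxPerWindow: number;
  private readonly windowMs: number;

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true and records the send if the user is still under the cap.
  tryAcquire(userId: string, nowMs: number): boolean {
    const recent = (this.sentAt[userId] ?? []).filter(
      (t) => nowMs - t < this.windowMs
    );
    const allowed = recent.length < this.maxPerWindow;
    if (allowed) {
      recent.push(nowMs);
    }
    this.sentAt[userId] = recent;
    return allowed;
  }
}

interface ScreeningResult {
  held: boolean; // true when the notification needs further checks
  deliverTo: string[]; // users who may receive it right now
}

// Hold back notifications with no recipient list; otherwise deliver only
// to users who are still under their cap.
function screenNotification(
  userIds: string[] | undefined,
  cap: PerUserRateCap,
  nowMs: number
): ScreeningResult {
  if (userIds === undefined || userIds.length === 0) {
    return { held: true, deliverTo: [] };
  }
  return {
    held: false,
    deliverTo: userIds.filter((id) => cap.tryAcquire(id, nowMs)),
  };
}
```

The point of the design is that an empty recipient list is treated as a reason to stop, never as "send to everyone", and that even a correctly targeted flood is limited per user.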
The last several hours have been a humbling period for us and we deeply regret the terrible customer experience this incident has caused. I am reaching out to the affected customers and apologizing for this error. We will strive to make it up to you by committing ourselves towards building a Myntra shopping experience that is truly wonderful.
Also, immediately, we will be reviewing all our systems and processes - and looking at our architecture as well as our deployment systems in great depth to check for any other such shortcomings.
[Note: My name’s Sunil Pai. I work as a UI architect in myntra. I wrote this as an internal post to myntra engineering, and they’re nice enough to let me share it with you. I’ve only edited some grammar and semantics. Giant disclaimer that I had no say in the company’s decision to go app-only, but read on for my opinion on it.]
I’d like to share some memories and thoughts on the desktop website, if only for posterity’s sake.
Early 2013, I was goofing around Bangalore trying to get my own startup off the ground. It was to be a tiny analytics tool for developers called ‘kolo’, that would hopefully make me enough money for food, cigarettes, and the occasional movie. As you can imagine, that didn’t go anywhere. By May, I’d lost most social skills, my facial hair made me look outright dangerous around children, and I was so. friggin’. bored. I used to hang on the weekends with a buddy who’d just started working at myntra, and he’d tell me about some of the problems facing the tech team. I told myself it’d be fun to maybe work on myntra’s problems for a while, until I knew what to do with kolo. I had an initial conversation with Shamik (holy shit, this guy got Lytro to market?), did the interview rounds, and got to work in June.
By this time, myntra had already decided to move to node.js. (I did a talk on the tech details of the move) Folks were already building dedicated (micro) services corresponding to the site’s features, so the only real job was to build the damn frontend. We did multiple projects at once. I remember while Gopi Raghu et al were doing the home/landing pages, I took on the search/product details page with Kunal Anand Sam et al, and we got cracking.
Those days were glorious. We were pumping out code so fast that we managed to fit in hundreds of extra goodies, and didn’t miss a single deadline. While originally to be just the ‘browser’ front end, we ended up writing the whole stack, all the way from servers and service clients and builds, to a highly tuned isomorphic stack for the browser app itself. The header rewrite happened in a weekend, and I lost count of the number of arguments with the designers. We set out to be the fastest ecommerce site in the country, and did that with room to spare. 6 months, thousands of lines of code, and unending conversations on how to do it ‘right’. Indeed, fixing this layer meant that the services finally got the load that they were built for, and exposed all the innards to true scale.
For perspective - we had 10+ php machines that would go down whenever we did a sale. We were struggling to do upwards of a crore a day during those times. When the node stack launched, we could run all of myntra’s traffic through ONE server. (theorized and tested by Rana, what a good day that was)
So we’ve grown like mad since then. The code we wrote in those days forms a big chunk of our networking layer now, and we’re easily India’s biggest node.js shop. We’ve written api gateways and mobile apis and microsites and whatnot. node’s the default front facing solution for webpages and apis (though there’s some movement around go/clojure that we should keep an eye on), and it’s fairly easy for newer engineers to jump into. All that code written on dark cold nights is still working, and constantly being fixed/tuned/replaced. The modular nature of node means we could move away from it in parts or completely whenever we choose, and overall it’s had a positive effect on the business. node does us well in these mobile times too, and that knowledge is turning out to be invaluable in writing actual mobile apps (more on that in a bit).
But for a while this last month, I’ve been feeling quite… sad. Purely sentimental, only because I’ve never really lasted long enough at a job to see my work being taken down. That project gave me personal validation, and made me learn so much. 2 years I’ve lasted at myntra! (I even tried quitting, but was thankfully talked off that ledge) I’ve received the occasional tweet abusing me about the site shutdown, telling me how much they loved the website ux, leaving me quite nonplussed and unsure how to respond. I’ve been uncertain about my own role in this company, and whether this paradigm shift would obsolete me.
I showed up for the 14th midnight shutdown event at office… not sure why. I suppose I had to show up for the funeral. Oddest feeling ever, doing this. Prasad was really sweet and called me up to push Anurodh’s finger to shut down the site, and we broke out the booze. After 2 glasses of champagne, and a day of thinking it through, I think I have a much better perspective on what this shutdown means to me.
For one, I’m not done as a web developer. I’ve barely started.
Read my lips: HTML/CSS/JS != Internet.
The real innovation in the mobile world has come from seamlessly connecting devices and services and people in real time with tiny electronic devices that everybody carries with them. And the scale of these devices is many multiples of what desktop will ever be. If you take that, and piggyback on the ubiquity of the Internet, magical things happen. Traditional Ecommerce is just the tip of a very, very big iceberg.
Read my lips: Mobile browsers != Mobile internet.
While the word ‘apps’ makes it sound like these are desktop style apps, with tiny closed domains as whole worlds unto themselves, the mobile scene is a little different. It’s getting incredibly easy to talk between apps and services, and smaller screens mean better, focused UX (at least from companies that give a shit :P) And I’d talk your ears off about how streaming apis/observables + IOT will literally change the world. Very exciting. By having access to a platform that’s so… ‘connected’ with human beings, I repeat, magical things happen. Understand - THIS is how we reinvent the web for the mobile world. And this is how we will delight our customers. What happens over the next 2 (3?) years of combined effort across the world will define what the mobile internet is. And while we’re at it…
Read my lips: Always bet on JS
But also, for this revolution, I actually get to participate. The desktop revolution happened when I was a child, but this time I’m in the thick of it. Heck, my employer just threw down the gauntlet to the world, with a bold stride towards this future. And my work experience as an async programming / user interface developer means I have the skills required to write programs for this world. On a kickass platform, on my own terms! This feels so empowering, I can’t stop smiling.
Secondly, I get to do this with people that I genuinely like.
Ecommerce fundamentally isn’t easy, because your users trust you with their money and personal details. This is an incredible privilege that we must never EVER abuse, and means we must always be sympathetic to their problems, even if they happen to be shouting at you loudly when it happens. What this also means, is that the team that runs this business has to operate with the utmost diligence and standards. Remember, site-stays-up is, and always will be, the engineering team’s first priority. When our site goes down, India gets MAD.
In the process of being in such a team, you get close to each other. You see the ugly sides, the personal quirks. You wind up at their weddings, and drunk late at night in the middle of Bangalore arguing about unit tests and PRDs. You have little to hide, because being ‘on the go’ all the time means these people truly become your friends. Some days are great, when we do launches and hit targets, and some days are bad, when we don’t.
The only reason we got through such a giant tech stack rewrite, was because everybody bought in, and did everything in their power to make it work. There are few teams in the world who can pull off this kind of multi-month effort, and I’ll always be proud of the way myntra behaved during the transition. Our culture let us achieve many big things, while the actual tech was just an implementation detail :) What I think that means, is that culture will be a key factor again during this mobile phase. It’ll be challenging considering all the new faces, but I have faith we’ll do it right.
So yeah, exciting times; for me, and for myntra. Many things will change - the way we write code, the way we deploy to these devices, the way we test our services (imagine being DDOSed by every mobile device on the planet), what ‘design’ means (do designers still use illustrator to mock tiny dynamic screens?), so called “web standards”… all of it. We already have a number of internal experiments progressing well, and I can’t wait to see how our customers react.
To wind this up, I’d like to make one last mention of the website. It struck me that no other website ends on a high, and most of them close because of failure. For that reason, myntra.com was the Sachin Tendulkar of websites. Sure, in his last few days, he got outshone by his younger teammates. But he also got a standing ovation when he left the field, if only for paving the way for the future.
We did an open Hackday, and we called it Hackerramp!
Hackerramp, Myntra’s first open hackathon, happened on the weekend of April 25-26, 2015. Thanks to all the 180 participants who made this event a grand success. We had around 35 teams presenting their amazing hacks to the Jury.
The goal was simple - make the next big thing in mobile. Be it an app, an API or even something related to mobile security. For more than 24 hours, keeping themselves fuelled on Red Bull, the teams raced the clock and converted their ideas into real working demos.
Myntra had done a Hackday in the past, but this was the first time we opened to everyone, which meant we had to get this right. There was a lot of preparation put in. There had to be great food throughout the day, great talks to kick off the event and some great goodies for the participants to take home!
We had three talks:
Shamik Sharma spoke about the new possibilities coming out of the mobile emergence.
Punit Soni spoke about his move from US to India and his experience at Google and Motorola.
Sunil Pai spoke about recent trends and changes in programming - introducing Reactive Programming to the audience.
We had an amazing panel judge the teams:
Amit Somani - Ex CPO, MakeMyTrip
Amod Malviya - CTO, Flipkart
Shamik Sharma - CTO, Myntra
Pramod Varma - Chief Architect, Unique Identification Authority of India
When you have 35 highly-driven teams competing for a prize bucket of 3,00,000 rupees, you know the Jury is going to have a hard time deciding. So hard, that they actually announced four winners!
Chat based commerce (Daredevils): Himadri, Jatin, Vijay and Mohan from Hike. They are part of Hike’s iOS team. Prize: Rs. 1,50,000.
Chat with friends realtime to buy products (Fashion Friends): Nikhil Bansal, Anurag Saxena, Shivam Gupta & Neera Singh. They are from Flipkart, Marketshare, Vizury and Parcelled. Prize: Rs. 75,000.
Augmented Reality Shooting Game (Massacre): Nilesh Hiray and Naveen Reddy from Myntra. Prize: Rs. 50,000.
Smart Links (The app that links all other apps): Niketh and Niteesh. They are from Yahoo and Fungru. Prize: Rs. 25,000.
Introduced in iOS 6, UICollectionView is a more powerful and flexible counterpart to UITableView. While a table view only allows you to lay out rows vertically, the number of different interactions that can be implemented with a collection view is huge. You could make a carousel, a gallery or even a circular layout of objects. And it’s all thanks to the fact that you can create your own custom layout logic for the items in the collection.
By subclassing UICollectionViewFlowLayout, you can specify the attributes for each item in the collection view manually. So you could position them in a circle, or you could tilt them sideways as the collection view scrolls - achieving a carousel effect. When we were developing the Myntra iOS app, we needed a way to have a sticky header for the search results. This header would then show prominent Sort and Filter buttons.
The sort/filter bar at the top is a section header in the collection of search results. So when you scroll down the results, the bar remains sticky at the top of the view. A quick search pointed us to CSStickyHeaderFlowLayout which is a simple flow layout subclass to achieve sticky headers.
But soon we needed sticky footers too. The idea was to make the Buy Now bar on the product page stick to the bottom of the screen when you scrolled through the details of the product.
The output of this effort is MYNStickyFlowLayout, a drop-in collection view layout class, that gives you sticky headers and footers for collection views. Let’s dive into the code and understand what is going on.
Let’s walk through what it takes to implement a custom layout for UICollectionView. First you subclass UICollectionViewFlowLayout, because that is an excellent starting point. The most important method to implement is - (NSArray *)layoutAttributesForElementsInRect:(CGRect)rect. This is called by UICollectionView every time it needs to display cells on the screen. It does this even before the cells are created. This is because the output of this method determines which cells are on screen and hence need to be displayed.
The view calls this method and expects an array of UICollectionViewLayoutAttributes objects in return. These objects have properties like frame, transform, alpha etc. that you can configure. In most cases, you probably want to start with the attributes generated by the parent class - UICollectionViewFlowLayout. Do this by calling [super layoutAttributesForElementsInRect:rect] and storing a mutable copy of the result, because we are going to be making changes.
In our case, we iterate over all these attributes and pick out the headers, the footers, and the first and the last cells in each of the sections. Then we iterate over the first cells in each section. This is because when you scroll down, the first cell of the section will take the footer sticking to the bottom along with it as it scrolls out of the bottom of the screen. The same logic applies to the headers and the last cells in each of the sections.
Now, because the collection view assumes that the header/footer is not on the screen, the array we got from the parent class might not contain the layout attributes for the header/footer. In that case the view will not create them, and they won’t show on screen. So if we don’t find them, we add them to the array ourselves.
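The "add them if missing" step can be sketched independently of UIKit. The TypeScript below is a simplified model and not the code in MYNStickyFlowLayout: `LayoutAttr`, `addMissingHeadersAndFooters` and the `make` callback are hypothetical names. In a real layout subclass, the missing attributes would typically be obtained from the layout's layoutAttributesForSupplementaryViewOfKind:atIndexPath: method, which the callback stands in for here.

```typescript
type ElementKind = "cell" | "header" | "footer";

// Simplified stand-in for a layout attributes object.
interface LayoutAttr {
  kind: ElementKind;
  section: number;
}

// For every section that has a visible cell, make sure the array also
// contains a header and a footer entry, creating any that are missing.
function addMissingHeadersAndFooters(
  attrs: LayoutAttr[],
  make: (kind: "header" | "footer", section: number) => LayoutAttr
): LayoutAttr[] {
  const result = attrs.slice();
  const sections: number[] = [];
  for (const a of attrs) {
    if (a.kind === "cell" && sections.indexOf(a.section) === -1) {
      sections.push(a.section);
    }
  }
  const kinds: Array<"header" | "footer"> = ["header", "footer"];
  for (const section of sections) {
    for (const kind of kinds) {
      const present = attrs.some(
        (a) => a.kind === kind && a.section === section
      );
      if (!present) {
        result.push(make(kind, section));
      }
    }
  }
  return result;
}
```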
Also while we are at it, we position them based on the position of the first/last cells. The final effect is that as the section scrolls out of view, it takes its header or footer along with it.
Every time the collection view scrolls, it recalculates these attributes and finds the best position for the section headers and footers. Thanks to all of this math, we finally get a sticky footer for the product details screen.
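The positioning itself comes down to clamping. Below is a minimal TypeScript sketch of that geometry for a vertically scrolling view. It illustrates the usual sticky header/footer clamping technique and is not copied from MYNStickyFlowLayout; the `CellSpan` type, the function names and the exact clamping bounds are assumptions made for the example.

```typescript
// Vertical extent of a cell in content coordinates.
interface CellSpan {
  minY: number;
  maxY: number;
}

// Header: pin to the top of the viewport, but never above its natural
// place (just above the first cell) and never past the end of the section.
function stickyHeaderY(
  offsetY: number,
  headerHeight: number,
  firstCell: CellSpan,
  lastCell: CellSpan
): number {
  const natural = firstCell.minY - headerHeight;
  const pushedOut = lastCell.maxY - headerHeight;
  return Math.min(Math.max(offsetY, natural), pushedOut);
}

// Footer: pin to the bottom of the viewport, but never below its natural
// place (just under the last cell) and never above the first cell.
function stickyFooterY(
  offsetY: number,
  viewportHeight: number,
  footerHeight: number,
  firstCell: CellSpan,
  lastCell: CellSpan
): number {
  const pinned = offsetY + viewportHeight - footerHeight;
  const natural = lastCell.maxY;
  return Math.max(Math.min(pinned, natural), firstCell.minY);
}
```

The lower bound in `stickyFooterY` is what makes the first cell carry the footer off the bottom of the screen, and the upper bound in `stickyHeaderY` is what makes the last cell carry the header off the top.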
What you see are two sections, and the first section has the Buy Now bar as a section footer. Hence it remains on screen while you are viewing the first section.
So that wraps up this post! Head over to MYNStickyFlowLayout on GitHub and try it out with the included example Xcode project. It is available as a CocoaPod with the same name - MYNStickyFlowLayout. Read the documentation on the repository to learn how you can use it in your project. Good luck!