Peter is Google’s Product Manager for Image Search. Prior to Google Peter was at Ask.com working on Search Quality. Prior to Ask Peter worked as an engineer at Oracle where he enabled regular expression support in SQL and developed a migration solution to convert databases to Unicode, both equally challenging and rewarding projects. Peter also worked with Jonathan Gennick and came up with a nifty book on Oracle Regular Expressions.
Interview Transcript
Eric Enge: What are some of the basic on page things that you can do to optimize your image search results?
Peter Linsley: There are a lot of best practices that we can touch upon, but I think I’d like to start off by describing the problem that image search engines in particular have. If you think about most web documents, they are to a certain extent structured data that we can crawl easily. The title and body of the page communicate a lot to the search engine. Think of a Wikipedia article titled “History of the United States,” which describes very accurately what you are about to read in the body of the page. The other thing that is key with web search is that they have things like backlinks and anchor text where you can get a read on what other people say about something in particular, in this case, a Wikipedia page.
When it comes to images, those signals are not always available. The crawler will have to look through the HTML, and it will find an image source tag. That’s pretty much all it knows about the image except for the alt text, which is something you can put into the image tag. The best thing to do with that is to describe what is in the image as efficiently as possible.
This has value in a lot of ways. I was on a local connection the other day, looking at a page for about 3 or 4 seconds as an image loaded, and all I could see was the alt text. If you have images turned off, if you are visually impaired, or if you are using a text browser like Lynx, the alt text is extremely valuable. In addition, if you hover your mouse over the image on most browsers, you can see the alt text.
We like it because users can see it and it brings them value. I think a good practice is to use the alt text to describe what you can see in the image, and you can use that same text elsewhere on the page. It could be the title of the image or part of a caption. Getting back to how a crawler looks through a page, we see this image tag and we can see the alt text, but of course it doesn’t say everything about the image that it possibly can. In Flickr’s case, people will highlight parts of the image, and say something about a particular part of the image, but none of this is structured, and it’s not really machine readable.
We have to guess and also look at what text we believe can be attributed with that particular image on the HTML page. Another good, simple practice is making the title, description or caption of the image obvious to the user. Hopefully, we’ll be able to figure out which text is associated with that image from an algorithmic point of view, and rank it accordingly.
One other good practice we can talk about is using the filename to label the image as well. If you include an image in multiple places on your site, or if other people happen to be including it on their sites, it’s an attribute that’s attached to the image wherever it goes. We certainly look at that too. There are a couple of things to note about the filename.
One thing that comes to mind is that a lot of operating systems or web servers do not allow you to use other languages that cannot be represented in ASCII. There are operating systems and web servers where you cannot use certain types of text in the filename. So for a search engine to look at the filename and treat it as the best description of the image would be a mistake. There are also some people who can’t accurately describe what’s in the image with the filename in their own language, and that’s one reason why we may or may not consider it a very strong signal as part of the ranking process.
Eric Enge: So for the alt text, would something like “picture of Charlie Chaplin dancing in the moonlight” be too long?
Peter Linsley: No, I wouldn’t say that was too long at all, and it’s very descriptive. If you can’t see the image, you can imagine what the image would be, and that’s really the whole point with the alt text, you could consider it a replacement for the cases where you can’t see the image. That sounds like a perfect example, it’s very descriptive of what you would see in the image if you could see it. That’s the sort of thing that I would definitely encourage.
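The signals discussed so far (the image source, the alt text, and the filename) can be illustrated with a minimal sketch of what a crawler might read from an image tag. This is only an illustration using Python's standard HTML parser, not a description of how Google's crawler works; the sample page, the paths, and the names `ImageSignalParser` and `extract_image_signals` are hypothetical.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import unquote, urlparse


class ImageSignalParser(HTMLParser):
    """Collects the src, alt text, and filename of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        # The filename is the last path segment of the image URL.
        filename = unquote(basename(urlparse(src).path))
        self.images.append({
            "src": src,
            "alt": attributes.get("alt") or "",  # empty when no alt text is given
            "filename": filename,
        })


def extract_image_signals(html):
    """Return one dict of signals per image found in the HTML (hypothetical helper)."""
    parser = ImageSignalParser()
    parser.feed(html)
    parser.close()
    return parser.images


# Hypothetical page: one well-described image and one with no description at all.
page = """
<h1>Charlie Chaplin</h1>
<img src="/photos/charlie-chaplin-dancing.jpg"
     alt="Picture of Charlie Chaplin dancing in the moonlight">
<img src="/photos/IMG_0042.jpg">
"""

for image in extract_image_signals(page):
    print(image)
```

The second image gives a parser like this nothing to work with beyond a camera-generated filename, which is the situation described above.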
Eric Enge: When does it become too long?
Peter Linsley: Obviously there is no hard and fast rule. I would just think of how the user would feel about it. I think if it’s a very long description of all the details of the image, it’s probably something that would be more useful on the webpage itself. A good rule of thumb is just to say, here is what’s in the image, and then you can put a title, caption, and description elsewhere on the page. I would treat it as an image title, and if you think about the title of an HTML page, I would give the same sort of treatment.
Eric Enge: Along those lines, you mentioned the value of text like captions, which your crawler will attempt to correctly associate with the image. Of course, that is generally nearby text, but I think as you look a layer up you have the general context of the page.
In my example with Charlie Chaplin dancing in the moonlight, if the page is all about Charlie Chaplin and that’s in the title and discussed in the article, then I would imagine all of this helps with the classification of what the image is about.
Peter Linsley: Absolutely, there is no doubt about that. It does, to the extent that the image itself is very important to the page, so that if the image were no longer on that page, the page would lose a significant amount of its utility.
It’s a very strong signal obviously, and it means we can start to take the context of the entire page into account.
If you have a large image above the fold where you can see enough detail of the site, and you give it a very clear title and the page and image are clearly related to one another, this is all a very strong signal. The key here is to think of the user. When they land on your page, they know at a glance what this page is about and the image fits in as an important component of the page itself.
Eric Enge: Let’s say you had a webpage with an article with 10 paragraphs and it covers 6 different subtopics all related to one topic. If you have an image related to each subtopic, there is definitely context matched with the content on the page, but it’s not like the focus of the page.
Peter Linsley: Yes. There are a couple of good things you can do here. Let’s take something like a blog category page, where you go to My-blog.com/category/san-francisco or something like that, where you are seeing every blog post that happens to be tied with San Francisco on one page, but each post talks about different things. We usually are pretty good at figuring out what one section of the page is about and what image is associated with it.
Then we find another section of the page and the content that can be associated with the particular image that it’s using. Another good practice is having a permalink for each and every entry. If you think about the blog category page example, in a lot of content management or blog publishing systems, the title of the blog post itself would typically be a link to a permalink page for that particular article. A good way of looking at it is that if you do have links to the particular section, that can really help us figure out what the canonical URL is for that image.
There is no hard and fast rule. It’s something I wouldn’t worry about too much if it’s intuitive to your end user. The goal for us is to try and figure out how to interpret that, and figure out which content is associated with which image.
Eric Enge: One of the interesting challenges you’ve outlined here is that you can’t fully parse what the image is by looking at the image file itself. A lot of what we have talked about relates to developing a level of confidence as to what the image is about, so as your confidence that an image is about a certain topic increases, is that a positive ranking signal?
Peter Linsley: Absolutely, yes. There are a number of ways we try to figure out what an image is about, what the content is and whether it would match the intent of a search on our site. You’ve touched on one of the most fundamental problems, which is that machines find it difficult to read an image and know what it is trying to represent. Let’s say I was out in San Francisco for the weekend, and I snapped a photograph of a shark jumping over the Golden Gate Bridge or something ridiculous like that.
There is not much I have to do to tell the readers of my site what’s going on here. I could just have a simple title that says wow, check this out, and then have the image there. The image will speak volumes, but there is nothing available for a search engine from a crawler or a machine point of view to be able to figure out what’s going on there unless you actually start to look at the pixels of the image itself. So it is certainly all about our confidence as to how strongly we believe that we’ve figured out what’s going on with the image.
Eric Enge: In the HTML world I always refer to this as having two dimensions. One is the relevance, and the other is importance, which relates to signals like links. In the image world, you can envision that relevance and importance would certainly still be factors, but now you have the additional factor of confidence. It’s a slightly different model, and because there are so many signals in the HTML world, that confidence is usually pretty high.
Peter Linsley: That’s right. People linking to your page, that’s a vote for that page from an external source, but people don’t typically link directly to your image, so it’s really up to us to figure out when those signals are talking about the image and when they are not, and factor that into the algorithms.
Eric Enge: How does the importance of the webpage influence the ranking of images and image search on the webpage?
Peter Linsley: It certainly is a signal that we use. PageRank is one of many signals that we consider, where people are just generally interested in that page, what it is talking about and how much of an authority it is. The value of a webpage can speak volumes about the images that it includes, so when we talk about image search from an SEO perspective, one of the best things we can say is all of the rules for web search apply. The goal here is to create a site that has a lot of unique and compelling content.
The extent to which you are the authority on a particular subject and can talk about it in a compelling manner will certainly start to provide a lot of signals that benefit the image as well.
Eric Enge: So you have a page with really great content that has been rewarded with links from all over the web and has other positive external signals associated with it, and the benefit of those signals accrues to the images on the page.
Peter Linsley: It certainly is part of the equation, yes.
Eric Enge: If you have a page with one dominant image on it above the fold, I imagine that gets a bit more attention than when you have 25 images scattered around the page.
Peter Linsley: Our general belief is that we want to return pages that are going to be useful to the user. That’s not to say that we are analyzing to see if images are above the fold and so on and so forth. It’s more to focus on things that provide a compelling and interesting experience for our users, and hopefully, a lot of the signals that we look at will start to point in the direction of this particular page by virtue of doing this. If you have an image-centric site, let’s say you’re a photoblogger or you run your own little stock photography site, it’s a much better practice to bring the users’ attention to that image immediately as they land on the page. Because we believe it will be a good user experience, we are more than happy to send our users to those kinds of pages. It’s more about focusing on the user experience.
We try to make sure that we return images for each query that are the best possible images out there. You can imagine there are plenty of pages that have many images on the page and are perfectly relevant to show for certain queries.
Eric Enge: If you have a page where the external links and the page title all match up and appear to be centered on one picture, that’s a lot of on-page focus.
Peter Linsley: Definitely, and it’s a very good user experience to boot.
Eric Enge: Speaking of ways to get more data on what images are about, one of the more interesting things is the image labeler game. Can you talk about that a little bit?
Peter Linsley: The inventor of the concept is Luis von Ahn, and there is a video of him discussing it that is really interesting to watch. The basic idea here is that there isn’t a whole lot of structured data around images on the web. For a lot of people, it’s a pain to label images. If you think about your own photo collection, I know I have tens of thousands of pictures in my photo collection, and I simply don’t have the time to go about describing them. This obviously makes it very hard to search for images that I took 3 or 4 years ago.
The basic concept here is to label images in a more effective manner. The idea is to present participants with an image and to also pair you up with somebody else online. You are not aware of each other, you are both looking at the same image and you have to start typing what you see essentially. Let’s say you and I were playing this game and we saw a picture of a shark jumping over the Golden Gate Bridge, we’d start to type in things like Golden Gate Bridge and shark and so on. The better your tags match up the more points you get.
You can imagine the net result here is you end up getting a lot of relevant tags because the chances of us matching on something like Golden Gate Park if it is not relevant to the image are very, very low. If you could do this in a very scheduled manner and get a whole lot of images tagged this way, it’s something you can imagine would be very useful for the search engine. The game is out there, and it’s a whole lot of fun. If you haven’t tried it, give it a shot. Without getting too much into the specifics, I guess I can say that the data we’ve got from this has been very interesting, and it’s taught us a lot about how we can improve our search quality and results for our end users.
Eric Enge: Just to elaborate on the game a little bit more, basically the closer the words that you associate with a picture are to someone else’s description, the more points you get, correct?
Peter Linsley: That’s right. If you look at the leader board, you’d be amazed by the number of points that some people have accumulated over the years.
Eric Enge: You also make a special note of the people who earn the most points in a day as well, right?
Peter Linsley: I believe that’s presented on the site, yes.
Eric Enge: The other interesting thing is that you get paired with a different person for each image, so you don’t develop this symbiotic pattern.
Peter Linsley: That’s right, yes. Part of the goal is to make sure you are getting a fair sample, so to speak. The other part of the game is that you can’t communicate to this other player by making up tags. You’ll only see the tags when you both match up, so it’s a really interesting concept. It’s really interesting to watch the original video of Luis von Ahn as he demonstrates it.
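The matching rule described above can be sketched in a few lines. This is a simplified illustration only, not the actual game's implementation: the normalization step, the `POINTS_PER_MATCH` value, and the function names are assumptions made for the example. Two players type tags independently, and only the tags that both of them typed count.

```python
POINTS_PER_MATCH = 100  # hypothetical value; the real scoring is not described here


def normalize(tag):
    """Lowercase a tag and collapse whitespace so trivial differences still match."""
    return " ".join(tag.lower().split())


def agreed_tags(player_one_tags, player_two_tags):
    """Return the tags both players typed independently, in sorted order."""
    one = {normalize(tag) for tag in player_one_tags}
    two = {normalize(tag) for tag in player_two_tags}
    return sorted(one & two)


def score_round(player_one_tags, player_two_tags):
    """Award points only for tags that both players agreed on."""
    return POINTS_PER_MATCH * len(agreed_tags(player_one_tags, player_two_tags))


# Both players see the same image: a shark jumping over the Golden Gate Bridge.
player_one = ["Golden Gate Bridge", "shark", "fog"]
player_two = ["shark", "golden gate bridge", "Golden Gate Park"]

print(agreed_tags(player_one, player_two))  # ['golden gate bridge', 'shark']
print(score_round(player_one, player_two))  # 200
```

An irrelevant tag such as "Golden Gate Park" drops out because the other player is unlikely to type it, which is why the agreed tags tend to be relevant.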
Eric Enge: You said it has also taught you a lot about the world of images and how to evaluate that data, so in your mind can you say that it’s had a direct positive impact on image search quality?
Peter Linsley: Without getting into specifics, it has taught us a lot and given us a lot of useful data.
Eric Enge: A couple of years back we witnessed the advent of universal search, which was a big thing. One of the things that’s been very prominent with that is images being served up in the regular search results. Over the years I have also seen that the amount of that integration has increased. You used to be able to type in something photography related, and you would occasionally get images, but now you get them much more frequently.
You can also get images where you just infer what people really mean if they want a picture, a photo or an image. How has this impacted image search from a couple of different perspectives? One is, has there been a huge increase in people getting image results that they acted on? The other is, has it done a lot for traffic at the original http://images.google.com?
Peter Linsley: Just to give a little bit of background on the motivation there, while we do have an image search property, we found that there are a number of queries being received on web that either had direct image intent, much like the ones you described, or they would be best answered by an image. You can think of various examples for this, like we might infer somehow that somebody typing in Empire State Building is purely interested in seeing what it looks like, and maybe they wanted to visit a site that had a lot of pictures related to their search topic as opposed to a site that wasn’t very picture or image-rich.
The idea is that if we believe an image would be a good answer for a particular query on the web, then we will just show them images. Image search provides the results for a universal search when images are shown, and I think it’s fair enough to say that it really does provide a lot of exposure, given that it is shown for a significant number of queries. A lot of people are maybe not aware that we have image search, even though it’s shown in the tabs across the top.
It’s second to web search in size, which makes it a really huge property, so it’s a really good way of exposing that Google is searching content from all sorts of different verticals across the web.
Certainly by virtue of showing this content, we have provided the user with more options and increased the chance they will find what they want quickly. Users are given two options when they see those units popping up, one of which is to click directly on a thumbnail and then get taken directly to the site that contains that image. Or they can click on a little link and choose to do the same query on image search. That then gives the user an option to focus on a particular image if they chose to do so.
So there are certainly cases where it’s bringing traffic into images.google.com, and hopefully, we are satisfying our users’ intent. If they end up figuring out their question could be answered directly through images.google.com, then that is fine too. The other case, of course, is queries where there are images that answer the question directly. This could be a query like “picture of Mount Rushmore.”
Eric Enge: I understand that the results that you show in universal search aren’t necessarily the ones that you show in an image search?
Peter Linsley: That’s true, that can happen. We believe the expectations are ever so slightly different between somebody doing a query on a web search and somebody doing an image search. When they are performing an image search at images.google.com it is clearly their intent to get images in response, which is not the case with web search. But there are other aspects of how their intent may differ based on the web property they use to perform the search.
Eric Enge: At this point, you probably have significantly more images served up in web search than in image search.
Peter Linsley: Well, I think they offer two very different experiences. One thing we found is that for a lot of queries on image search people like to see a lot of images. Another thing is that a lot of queries are very subjective in nature. You might do a query for something like waterfalls, and you have in mind the kind of waterfalls that you want to see or the kind of site that you want to navigate to, but it’s very difficult for a search engine to know exactly what you are after ahead of time, which is essentially the goal of web search of course.
So there are cases where 3 or 4 images just don’t cut it, but universal search offers you that ability to dive into that image-centric experience, where you can jump right into the property and you can page through hundreds and hundreds of images.
People can consume image snippets at a much faster rate than web results, where you have to click through and evaluate each site and its content, so I think they complement each other very well. There are queries on web search where we believe users might be interested in seeing images as the answer to their question, but it also offers the ability to dive in and have this very image-centric experience in the property.
Eric Enge: Now what about the bane of all search, which is spam? What sort of issues do you face with spam in image search?
Peter Linsley: For the most part, image search can inherit quite a lot of the work from the webspam team at Google, who do an incredible job of identifying pages that are not really in the best interest of our users, and taking appropriate actions. That’s something that we inherit directly at image search. We know we are associated with a particular webpage because that’s where we take you after you click on the result. For the most part, I think we inherit that, meaning the best practices that you can read on the website and talk about on the web all pretty much apply to image search.
The other thing to add to this is that it’s very easy for the average consumer to go and get a nice camera and create some unique content. We believe that if your motivation is to get traffic, you should just put out some unique content; it shouldn’t be that difficult.
Eric Enge: Let’s talk a little bit about things that are coming in the future. One of the obvious interesting things is scanning and extracting information directly from the image, and you can talk about facial recognition software or optical character recognition software, and various kinds of tactics. But I want to get your view on what’s exciting in that area and what kind of things you think could happen?
Peter Linsley: Image search is certainly a really interesting property, and it’s growing very rapidly. But, more importantly, so is the world of online images, especially as it becomes easier for your average user to take photographs. A digital SLR camera would have cost me thousands of dollars several years ago, but now it is much, much cheaper, so it’s becoming much easier for your average user to get hold of cameras that take really high-quality images. The cameras that are in cell phones are becoming better and better, which translates into a whole lot of really nice, unique online content, and it is our goal to organize and present that content to users when we believe it’s the best possible image for that query. This is just absolutely exploding. Not too far in the future, people will be able to take a photograph and their wireless SD card will simply upload their photos directly to the web and publish them instantaneously. Pretty much every image you take is unique. You can imagine the world of images is just absolutely exploding right now, and today we look at the hundreds of billions of images out there that we are trying to organize and index.
You can imagine it becoming trillions of images in the not too distant future. Then the question, of course, will be how can we organize all of this content. So a lot of our focus is on where we think this industry is going, and where we think the area is going. Certainly you can imagine the amount of effort it takes to write up a nice webpage and put tags, a title and all your alt text on it, but most users are lazy and they just don’t really have time to do this. We are very, very interested and excited about the future of computer vision, visual search as we call it, which involves looking at the pixels of the image and trying to figure out what’s going on, and trying to associate that with the user’s intent. It is definitely an area we are very interested in. On the flip side, you can also see a slightly broken paradigm. If somebody has an intent to see an image, why does the intent have to start with a text query in a query box?
Why couldn’t they just describe the image they are looking for in other, more visual ways? This is another area that we are very excited about, and you can see the initial fruits of this labor in the Similar Images launch, which was launched last month. This allows you to look at search results and explore particular genres of images in more depth.
You can do a query like Paris, and you could imagine a good search engine might show you an image of Paris Hilton, Paris, Texas, and a picture of the Eiffel Tower. With Similar Images, you can dive into a particular image, and take along an image as a supplement query with your text query. This will allow you to be able to dive into that space and see images that are very similar to the Eiffel Tower image that you just clicked on.
Eric Enge: So you are using the image at that point as the next search query?
Peter Linsley: That’s right, that’s exactly how we look at it. We think it’s a very exciting space and we think it will help answer a lot of our users’ questions when they just can’t quite figure out how to describe the image they are looking for. Plus, certainly the area of computer vision is something we are very interested in, and we think it’s going to be able to provide answers to the numerous questions that our users have as they arrive on our site.
Eric Enge: I’ve heard that facial recognition software is already in use.
Peter Linsley: If you are a Picasa user, it is possible that you have already used this feature. It’s a very interesting way of tagging your images. Say I have photographs of my family members; Picasa will come back and tell me who this person is and will go and tag all of my images of this particular person. It’s really cool technology, and I am sure you can imagine we’d be very interested in how this technology could be used in image search itself.
Eric Enge: Somebody might type in “Thomas Jefferson headshot” as a query. You would want to be able to distinguish between full body pictures and headshots, right? So I think that is an example of a pretty specific thing that is of interest.
Peter Linsley: One of the other things we launched relatively recently was the ability to filter your results down to images that just contain faces. That’s made available in the dropdown page, so it’s something that we are already doing today. I think you can imagine a whole lot of extra similar filters being useful to end users as well.
Eric Enge: Is there anything else you would like to add?
Peter Linsley: One topic that I am personally interested in is the area of outreach, and we are really interested in hearing more from webmasters about some of the issues they’ve perceived with image search and how we can collaborate with other search engines to try and help resolve some of these issues. I could think of a lot of examples in the web world where representatives from Google and other search engines have been up at search conferences, and they listened to the audience and helped them resolve problems that they may have had.
Most recently you can think of the canonical link tag, which is allowing webmasters to tell Google that this is the one URL we want you to index and treat as the canonical. The Sitemap is another good example to help the webmaster tell search engines more about their content and how to best crawl and index it and so on and so forth, so I am really interested in hearing more from webmasters who have image-centric sites or images on their sites.
There could be various ways that they think they can help us improve their ability to get images indexed and ranked and improve our end user experience at the same time. I am really excited to get more involved with the webmaster community as we go forward. We will be doing a lot more outreach from the image search team and just listening and trying to make it a win-win situation for everybody.
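As a small illustration of the canonical link tag mentioned above, the sketch below reads the `<link rel="canonical">` element from a page using Python's standard HTML parser. It shows only how a webmaster's declared canonical URL can be read from the markup, not how any search engine processes it; the sample page, the example.com URL, and the names `CanonicalLinkParser` and `find_canonical_url` are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical_url is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, so check each one.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical_url = attributes.get("href")


def find_canonical_url(html):
    """Return the declared canonical URL, or None if the page declares none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    parser.close()
    return parser.canonical_url


# Hypothetical blog post page that declares its permalink as the canonical URL.
page = """
<html><head>
<title>Shark over the Golden Gate Bridge</title>
<link rel="canonical" href="https://example.com/blog/golden-gate-shark">
</head><body><img src="/photos/shark.jpg" alt="Shark jumping over the Golden Gate Bridge"></body></html>
"""

print(find_canonical_url(page))
```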
Eric Enge: What is the best way for someone who has an image search question to get in touch with the right person on the image search team?
Peter Linsley: We have the Google Web Search Forums, which are monitored. Various members of the image search team drop by, and we pretty much follow and respond to every image search related question, so I would suggest posting your questions there.
Eric Enge: Thanks Peter!
Peter Linsley: Thank you, Eric!