The Beginner's Guide to Yahoo Pipes: Part 1

Recently, I have begun using Yahoo Pipes to quickly grab data from the web. Yahoo Pipes is a service designed to take in data from the web and manipulate it. Typically, that data is retrieved through RSS feeds. If you already use RSS feeds, Yahoo Pipes may interest you - but if you have tried it before, you probably took one look and ran away. My goal is to show you how easy and useful Pipes can be!

An Introduction

Yahoo Pipes allows you to take data from multiple sources on the web (typically from RSS feeds) and manipulate it - you can merge (mash) data together, split data up, get more detail, or get less detail. You pick and choose what is displayed in the end result: a finalized feed.

The possibilities are endless. You can use Pipes for lifestreaming - a single aggregated feed of all your online activity. You could generate a “master” feed that has all of your favorite news in one location. Or you can take some of that data and do something else with it.

A little background

If you are not a “Unix geek”, the term pipes might sound odd. On the Unix command line, “piping” data from one command to another is a common task. It is a powerful way to make commands work together. The term comes from the simple idea that data “flows” from one command to the next, like it would through a pipe. Using this idea, Yahoo has created a tool that allows a non-programmer to do complex things with data, using a simple graphical interface.
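For readers who do program, here is a minimal sketch of the same “flow” idea in Python. It is only an illustration, not how Yahoo Pipes works internally: the function names and the sample headlines are made up, and each function stands in for one stage that hands its output to the next.

```python
# Hypothetical three-stage pipeline, loosely like chaining commands with "|".
def read_lines(text):
    """Source stage: emit one line at a time."""
    for line in text.splitlines():
        yield line


def keep_matching(lines, word):
    """Filter stage: pass along only the lines that contain `word`."""
    for line in lines:
        if word in line:
            yield line


def sort_lines(lines):
    """Final stage: collect everything that arrived and return it sorted."""
    return sorted(lines)


# Made-up sample data.
headlines = "yahoo launches pipes\nweather report\nyahoo pipes tutorial"

# Data flows from left (innermost call) to right (outermost call).
result = sort_lines(keep_matching(read_lines(headlines), "yahoo"))
print(result)
```

Each stage knows nothing about the others; it just takes what comes in and passes something on. That is the property Pipes borrows.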

Why is it so great?

You can think of it as programming, without the code. While a little background in programming will certainly speed up the learning, with a couple of tips, you could become quite a little genius at “mashing up” data.

Why do I need to create my own pipes?

There are plenty of premade pipes out there for you to use. They might work well for what you need. But the real power with pipes comes with the ability to customize them the way YOU want them to be - not the way someone else thought was best. Sometimes, you can take a pipe that someone else created, and make a small change, so it does exactly what you wanted.

Let’s Get Started!

If you already have a Yahoo ID, you can go right to the Yahoo Pipes homepage and sign in. If not, go ahead and sign up.

As we begin to create pipes, your main page will show a list of pipes you have already created. Click the Create a pipe button to begin.

We will start with a pretty common use of Pipes - combining multiple RSS feeds. But first, we need to understand the basics of what makes up a pipe.

The Basics

Pipes are made from various modules that tell the pipe exactly what to do.

To create a pipe, each module is connected to the next. Every module has connectors that allow you to link it to other modules. Depending on the module, it might have an input connector, an output connector, both an input and an output, or multiple inputs or outputs. Data will “flow” through the pipe, leaving one module as output and entering the next as input. If that sounds confusing, don’t worry, it will start to make sense.

Just think of it as a puzzle: some pieces work together, and some don’t, but they all fit together to solve the final puzzle.

Yahoo has provided a bunch of modules for your use, in the following categories:

Source modules - These modules will tell the pipe where to get data from.

User Inputs - These modules allow a user of a finalized pipe to specify certain required information dynamically, such as a URL, a user ID, or a search term.

Operators - These modules allow you to manipulate the source data.

The URL, String, Date, Location, and Number modules are what I like to call “Helper Modules”, to make working with certain data a little easier.

Getting the Data

The first thing we need to do is get some data. We do this by adding a “Source”. The Sources group lists the options we have: we can use a few pre-made sources, such as Yahoo Local, Yahoo Search, or Google Base, but we can also pull data from an RSS feed, using the ‘Fetch Feed’ source.

We will use the Fetch Feed source to get some data. Click and drag the Fetch Feed module into the workspace.

A source module may require you to give it some information, in this case, the URL for the RSS feed. Once you have added the source, type the feed's URL into the text box, telling the module where we want to get the data from. I am going to use the CNN Top Stories feed at http://rss.cnn.com/rss/cnn_topstories.rss

You’ll notice that this source module only has a single output connector. Output connectors are on the bottom of the module.
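If you are curious what a source like Fetch Feed hands to the rest of the pipe, the following Python sketch gives a rough idea. It is an assumption-laden illustration, not Yahoo's implementation: the `fetch_feed` helper is hypothetical, and it parses a small made-up RSS document from a string instead of downloading a real feed.

```python
import xml.etree.ElementTree as ET

# Made-up sample RSS 2.0 document standing in for a downloaded feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Top Stories</title>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <pubDate>Sat, 13 Sep 2008 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
      <pubDate>Sat, 13 Sep 2008 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def fetch_feed(rss_text):
    """Hypothetical helper: turn RSS text into a list of item dictionaries."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "pubDate": item.findtext("pubDate"),
        })
    return items


items = fetch_feed(SAMPLE_RSS)
for entry in items:
    print(entry["title"], "-", entry["pubDate"])
```

The list of items is the module's single output, which is why a source module only needs one output connector.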

Making the Pipe Connection

Once you have added the source, you should see two boxes available - the Source, and the Pipe Output.

The Source is where we are getting the data from, in this case, a custom RSS feed.

The Pipe Output is where we want to end up, giving us our finalized data output. Notice the Pipe Output only has a single input connector. Input connectors are at the top of the module.

Connect the two modules by clicking and dragging from the Fetch Feed blue output connector, to the Pipe Output blue input connector. The results should look like this:

Congratulations! You have made a Yahoo Pipe! To see our results, we need to click the Pipe Output module, so that it is highlighted in orange, then look in the ‘Debug’ area at the bottom. You should see the results of your RSS feed appear.

So far, all this does is give us the same information that a standard RSS feed would. Let’s work on improving that.

Adding Another Feed

What if I want to have multiple feeds combined together to get all my news from one feed? Let’s add another Fetch Feed box to the mix, this time adding MSNBC Top Stories at http://rss.msnbc.msn.com/id/3032091/device/rss/rss.xml

I can now see the data from this feed in the debugger at the bottom. Whichever module you click on (the one highlighted in orange) is where the debugger window at the bottom will get its results from. This is a good way to make sure you are getting accurate information. If you click back on the CNN Fetch Feed, you will see the CNN results. However, our final result (the Pipe Output) is still only “linked” to the CNN data. Let’s fix that.

Yahoo Pipes provides a number of Operators that allow you to manipulate data. There are some more advanced Operators, but all we want to do right now is combine (or Union) multiple RSS feeds.

Add a Union module into our Pipe:

Now that the Union is in place, you will notice you can connect up to four input modules to the top, and there is one output at the bottom. Let’s re-connect the feeds so they are combined, before going to the pipe output. To disconnect the current connected pipe, just click on the blue dot on Pipe Output, drag it off, and release. Then reconnect the two Fetch Feed modules to the Union input. Finally, connect the Union output to the Pipe Output.
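In code terms, a union is simply “put all the items from every input into one list”. The Python sketch below shows that idea under stated assumptions: the `union` function and the sample items are hypothetical, and the real Union module may behave differently in its details.

```python
from itertools import chain


def union(*feeds):
    """Merge any number of item lists into one list, keeping each feed's own order."""
    return list(chain.from_iterable(feeds))


# Made-up items standing in for the output of the two Fetch Feed modules.
cnn_items = [{"title": "CNN story A"}, {"title": "CNN story B"}]
msnbc_items = [{"title": "MSNBC story A"}]

combined = union(cnn_items, msnbc_items)
print([item["title"] for item in combined])
```

Notice that the combined list is just one feed followed by the other. Nothing here interleaves the stories by date.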

Final Result

A brief review: This pipe gets the CNN feed data, gets the MSNBC feed data, combines (unions) them, and uses the result as the final output for our pipe. Pretty neat, huh?

To make this useful, we are going to make one modification to the pipe. Right now, the results are combined, but in no particular order. We are going to add a Sort Operator module, so that all the data comes out in the order we want.

Drag a Sort module from the Operators group into our workspace, and connect it between the Union module and the Pipe Output module. When you connect the Union to the Sort module, it will populate the sort options with the available item details. In our case, we want to sort by the Published Date, or item.pubDate, in descending order - so our results will show the newest item at the top, regardless of whether the item is coming from CNN or MSNBC.
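Sorting by item.pubDate in descending order can be sketched in Python as follows. This is an illustration only: the `sort_by_pub_date` helper and the sample items are made up, and it assumes every item carries a pubDate in the usual RSS date format.

```python
from email.utils import parsedate_to_datetime

# Made-up merged items, in no particular order.
items = [
    {"title": "Older CNN story", "pubDate": "Sat, 13 Sep 2008 08:00:00 GMT"},
    {"title": "Newest MSNBC story", "pubDate": "Sat, 13 Sep 2008 12:30:00 GMT"},
    {"title": "Middle CNN story", "pubDate": "Sat, 13 Sep 2008 10:15:00 GMT"},
]


def sort_by_pub_date(items, descending=True):
    """Hypothetical helper: order items by their parsed pubDate."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=descending,
    )


newest_first = sort_by_pub_date(items)
for entry in newest_first:
    print(entry["pubDate"], "-", entry["title"])
```

The dates are parsed before comparing them. Sorting the raw text would order items alphabetically by day name rather than by time.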


Saving and Running the Pipe

Once we are finished, we can save the pipe for later use. On the top right, click the Properties button, and fill in the information. You can also Publish this pipe, allowing the pipe to be available to the general public:

Click OK, then click the Save button at the top right corner. You will see a link at the top that says ‘Pipe Saved .. Run Pipe’ - click Run Pipe to see our final results:

You can see the final pipe that was created here.

Part Two

In Part Two of the series, we will begin to discuss some other source types, accepting user input, and more ways to modify our data - along with what we could do with the data once it has been generated.

  • September 13, 2008 at 11:59 pm Mike Fruchter
    Tim, excellent job on this tutorial!
  • September 14, 2008 at 12:08 am embee
    Pretty pretty neat ! Thanks a bunch!
  • September 14, 2008 at 4:24 am todd
    Very nice. Thanks. I've meant to learn something about Yahoo Pipes and now I have!
