GraphML imports vertices but not edges - malformed on my end?

Jan 21, 2011 at 6:00 AM


I'm new to GraphML, so I suspect the problem is with my GraphML file's format somehow. But, here's the issue - I run "Import GraphML File", and it completes without throwing errors. However, my edges sheet is empty, and my vertices sheet has all the info I expect in that one. Here's a snippet of my GraphML file:

<?xml version="1.0" encoding="Windows-1252"?>
<graphml xmlns="" xmlns:xsi="" xsi:schemaLocation="">
  <key id="a0" for="node" attr.name="type" attr.type="string" />
  <key id="a1" for="node" attr.name="name" attr.type="string" />
  <key id="a2" for="node" attr.name="year" attr.type="int" />
  <key id="a3" for="node" attr.name="sex" attr.type="string" />
  <key id="a4" for="node" attr.name="imdb_rating" attr.type="double" />
  <key id="a5" for="node" attr.name="imdb_votes" attr.type="int" />
  <graph id="G" edgedefault="undirected">
    <node id="643">
      <data key="a0">actor</data>
      <data key="a1">Aaronovitch  Owen</data>
      <data key="a3">M</data>
    </node>
    <node id="1395">
      <data key="a0">actor</data>
      <data key="a1">Abbott  Dalton</data>
      <data key="a3">M</data>
    </node>
    <node id="1302608">
      <data key="a0">actor</data>
      <data key="a1">Abbott  Deborah</data>
      <data key="a3">F</data>
    </node>
    <edge id="e1" source="121926" target="4742622" />
    <edge id="e2" source="570362" target="4742622" />
    <edge id="e3" source="881257" target="4742622" />
  </graph>
</graphml>

Any idea what I screwed up there? The line spacing isn't the same in this cut + paste as in the actual file. Could that have something to do with it?

Oh, I'm using v.



Jan 21, 2011 at 7:25 AM

Hello, Libby:

Your GraphML is perfectly legal, and NodeXL happily reads it.  The problem is that the edges that are specified in the GraphML connect vertices that are not contained in the GraphML.  NodeXL skips such edges.

Look at this edge, for example:

<edge id="e1" source="121926" target="4742622" />

That line says that the edge connects the vertex that has the id "121926" to the vertex that has the id "4742622."  If you look through the GraphML, however, you'll see that there are no such vertices.  Thus, NodeXL skips the edge.

The following would be a valid edge that NodeXL would read, because it connects two vertices that are contained in the GraphML:

<edge id="e1" source="643" target="1395" />
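
If the file is too large to check by eye, a short script can list the edges whose endpoints are missing. The following is only a minimal sketch in Python, not anything that ships with NodeXL; the names `local_name`, `find_dangling_edges`, and `SAMPLE` are made up for illustration, and the sample data is a cut-down stand-in for a real file. It compares local tag names so that it works whether or not the file declares the GraphML namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def find_dangling_edges(root):
    """Return (edge id, source, target) for each edge with a missing endpoint."""
    node_ids = set()
    edges = []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "node":
            node_ids.add(elem.get("id"))
        elif name == "edge":
            edges.append((elem.get("id"), elem.get("source"), elem.get("target")))
    # An edge is dangling if either endpoint is not a declared node id.
    return [e for e in edges if e[1] not in node_ids or e[2] not in node_ids]


# Hypothetical sample: e1 points at undeclared vertices, e2 is valid.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="643"/>
    <node id="1395"/>
    <edge id="e1" source="121926" target="4742622"/>
    <edge id="e2" source="643" target="1395"/>
  </graph>
</graphml>"""

for edge_id, source, target in find_dangling_edges(ET.fromstring(SAMPLE)):
    print(f"edge {edge_id}: {source} -> {target} has a missing endpoint")
```

For a file on disk, `ET.parse(path).getroot()` can be passed in place of `ET.fromstring(SAMPLE)`.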

You didn't say where the GraphML came from.  Was it generated by a program you wrote yourself?  If so, here is a good primer on what GraphML is supposed to look like:

-- Tony

Jan 21, 2011 at 2:31 PM
Hi Tony,

I was afraid my snippet would be wrong. The whole file is 28,000+
lines, so I didn't want to post the whole thing. Those lines are just
three from the node section and three from the edges section. I just
double checked to make sure that the IDs in the node section are
appearing as "source" and/or "target" in the edges section in the real
doc, and they are.

I used Ruby to create the GraphML file from a MySQL database - even
used that page you linked, actually. I'll try making a smaller file
that's identical in form, maybe just 10 nodes and their edges,
and see if that works.
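
One way to cut a big file down to a test-sized one is sketched below in Python rather than Ruby. This is only an illustration under assumptions: `shrink_graph`, `local_name`, and `SAMPLE` are hypothetical names, and the sample stands in for the real file. It keeps the first few nodes and drops every edge that does not connect two kept nodes, so if the first nodes in the file are not linked to each other, the result will contain no edges at all.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def shrink_graph(root, max_nodes=10):
    """Keep the first max_nodes nodes and the edges between them (edits in place)."""
    graph = next(e for e in root.iter() if local_name(e.tag) == "graph")
    kept = set()
    # First pass: keep the first max_nodes nodes, remove the rest.
    for child in list(graph):
        if local_name(child.tag) == "node":
            if len(kept) < max_nodes:
                kept.add(child.get("id"))
            else:
                graph.remove(child)
    # Second pass: remove edges with an endpoint outside the kept set.
    for child in list(graph):
        if local_name(child.tag) == "edge":
            if child.get("source") not in kept or child.get("target") not in kept:
                graph.remove(child)
    return root


# Hypothetical sample: three nodes in a chain.
SAMPLE = """<graphml>
  <graph id="G" edgedefault="undirected">
    <node id="a"/>
    <node id="b"/>
    <node id="c"/>
    <edge id="e1" source="a" target="b"/>
    <edge id="e2" source="b" target="c"/>
  </graph>
</graphml>"""

small = shrink_graph(ET.fromstring(SAMPLE), max_nodes=2)
print(ET.tostring(small, encoding="unicode"))
```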

Jan 21, 2011 at 3:34 PM

Hi again,

Thanks for your help, Tony. You were right about it being a "source" and "target" id problem after all. I'd done a calculation on the IDs before inserting them, and then I did it backwards, basically. Anyway, I appreciate your help, and now I have it right.