Fast way to create an edge between existing vertexes

Hi,

I am running a batch operation that creates vertexes and edges from separate files. The two loads run separately, and all the vertexes are created before the edge files start being processed. What is the fastest way to add an edge between two existing vertexes using Java?

I am able to make it work by querying both vertexes and then adding the edge, but it takes 3x-4x as long as adding a vertex alone. I understand that matching the vertex performance might not be possible, but 3-4x is a huge degradation.

Is there any way I could speed it up to a reasonable number? Pasting my code below.

    String fromLabel = edgeRecord.getFromLabel();
    String toLabel = edgeRecord.getToLabel();
    String fromId = edgeRecord.getFromId();
    String toId = edgeRecord.getToId();

    graph.createEdgeClass(edgeRecord.getName());

    String fromLabelQuery = "SELECT * from "+fromLabel+" where id = ? LIMIT 1";
    String toLabelQuery = "SELECT * from "+toLabel+" where id = ? LIMIT 1";

    long startTimeForFromNode = System.currentTimeMillis();
    OGremlinResultSet rs = graph.querySql(fromLabelQuery , fromId);
    Optional<OVertex> sourceVertex = Optional.empty();
    while (rs.getRawResultSet().hasNext()) {
        OResult item = rs.getRawResultSet().next();
        sourceVertex  = item.getVertex();
    }
    rs.close();

    //System.out.println(">>> Time taken for startTimeForFromNode : "+(System.currentTimeMillis()-startTimeForFromNode));

    long startTimeForToNode = System.currentTimeMillis();

    OGremlinResultSet rs_toVertex = graph.querySql(toLabelQuery, toId);
    Optional<OVertex> toVertex = Optional.empty();
    while (rs_toVertex.getRawResultSet().hasNext()) {
        OResult item = rs_toVertex.getRawResultSet().next();
        toVertex  = item.getVertex();
    }
    rs_toVertex.close();

    //System.out.println(">>> Time taken for startTimeForToNode : "+(System.currentTimeMillis()-startTimeForToNode));

    long timeToaddTheEdge = System.currentTimeMillis();
    if(sourceVertex.isPresent() && toVertex.isPresent())
    {
        OVertex toVertexObj = toVertex.get();
        OVertex sourceVertexObj = sourceVertex.get();
        OEdge oedge =  sourceVertexObj.addEdge(toVertexObj, edgeRecord.getName());
        for(Map.Entry<String, Object> propEntry : edgeRecord.getProperties().entrySet())
        {
            oedge.setProperty(propEntry.getKey(), propEntry.getValue());
        }
        oedge.save();
    }

Hi @ODB19

The first thing I’d suggest is to check that you have an index on the vertex id property; it can make a big difference.

Apart from that, there is no specific procedure to speed up this operation, but you can make some basic improvements based on the use case. For example, you can sort the edge creation by FROM vertex (or by TO vertex, whichever is more convenient); this will likely allow you to load one of the vertices only once for multiple edge creations.
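The sorting idea can be sketched roughly as follows. This is only an illustration using plain JDK types: `EdgeRef`, `lookupFrom` and `createEdge` are hypothetical stand-ins for the edge record, the SELECT query and the `addEdge` call from the question, not OrientDB APIs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class SortedEdgeLoad {

    /** Hypothetical minimal edge record: only the two vertex ids. */
    static final class EdgeRef {
        final String fromId;
        final String toId;

        EdgeRef(String fromId, String toId) {
            this.fromId = fromId;
            this.toId = toId;
        }
    }

    /**
     * Sorts the edges by FROM id and loads each FROM vertex once per run of
     * equal ids. Returns the number of FROM lookups that were performed.
     *
     * lookupFrom stands in for the SELECT query (assumption);
     * createEdge stands in for resolving the TO vertex and adding the edge.
     */
    static <V> int process(List<EdgeRef> edges,
                           Function<String, V> lookupFrom,
                           BiConsumer<V, String> createEdge) {
        List<EdgeRef> sorted = new ArrayList<>(edges);
        sorted.sort(Comparator.comparing((EdgeRef e) -> e.fromId));

        int lookups = 0;
        String currentId = null;
        V currentVertex = null;
        for (EdgeRef e : sorted) {
            if (!e.fromId.equals(currentId)) {
                // New FROM id: this is the only place a FROM lookup happens.
                currentVertex = lookupFrom.apply(e.fromId);
                currentId = e.fromId;
                lookups++;
            }
            createEdge.accept(currentVertex, e.toId);
        }
        return lookups;
    }
}
```

This only helps when a whole batch of edge records is available before any of them is written, since they have to be sorted first.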

Thanks

Luigi

Thanks @luigidellaquila for getting back. Yes, I have indexes on all the vertex fields used for retrieval; without them the load had been running for something like 2 days when I cancelled it. The other thing you suggested may not work for me, as I am getting one edge at a time from a file. I am looking for something that is contained in the same iteration.
Do you know of any way I could do it all in one call?

I had a similar issue, and I was on serverless, so even predicting the sequence was not possible.
There are two ways I would do it:

  1. Send a single query to Orient: CREATE EDGE is_member_of FROM (SELECT FROM Group where name = 'B') TO (SELECT FROM Group where name = 'A');

  2. Cache the @rid in memory for the FROM and TO vertexes. This will be useful only if you know that there are a lot of edges connecting them. I would cache up to 100K entries in a HashMap; each entry is just a plain string plus a rid, which is very small, so the whole cache will fit in 10-20 MB. Use these rids instead of SELECT queries in the edge creation. On top of that, send the edge creations to the DB server in batches of 100.
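Both options can be sketched in plain Java along these lines. Treat it as a rough outline, not a tested OrientDB integration: the class and method names are made up for illustration, `loader` is a hypothetical function that runs the SELECT and returns the @rid as a string, and the `?` placeholders and the `BEGIN; ... COMMIT;` script wrapper are assumptions about what your OrientDB version accepts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class EdgeBatchSketch {

    /** Option 1: one statement per edge; the two ids are bound to the '?' placeholders (assumption). */
    static String createEdgeStatement(String edgeClass, String fromLabel, String toLabel) {
        return "CREATE EDGE " + edgeClass
                + " FROM (SELECT FROM " + fromLabel + " WHERE id = ?)"
                + " TO (SELECT FROM " + toLabel + " WHERE id = ?)";
    }

    /** Option 2: bounded LRU cache from "label:id" to the vertex @rid string. */
    static final class RidCache {
        private final Map<String, String> cache;
        private final Function<String, String> loader; // hypothetical: runs the SELECT, returns the @rid
        int misses = 0;

        RidCache(final int maxEntries, Function<String, String> loader) {
            this.loader = loader;
            // accessOrder = true makes the map evict the least recently used entry.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        String ridFor(String label, String id) {
            String key = label + ":" + id;
            String rid = cache.get(key);
            if (rid == null) {
                misses++;                 // only a miss costs a database query
                rid = loader.apply(key);
                cache.put(key, rid);
            }
            return rid;
        }
    }

    /** With both @rids known, the edge statement needs no subqueries. */
    static String createEdgeByRid(String edgeClass, String fromRid, String toRid) {
        return "CREATE EDGE " + edgeClass + " FROM " + fromRid + " TO " + toRid;
    }

    /** Groups statements into scripts of at most batchSize statements each. */
    static List<String> toScripts(List<String> statements, int batchSize) {
        List<String> scripts = new ArrayList<>();
        for (int i = 0; i < statements.size(); i += batchSize) {
            List<String> chunk = statements.subList(i, Math.min(i + batchSize, statements.size()));
            scripts.add("BEGIN;\n" + String.join(";\n", chunk) + ";\nCOMMIT;");
        }
        return scripts;
    }
}
```

Each script, or each single statement together with its two id values, would then be handed to whatever command-execution call your OrientDB client exposes. The cache pays off only when the same vertexes recur across many edges, as noted above.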