RDF export should stream #101

Open
armisael opened this issue Nov 4, 2014 · 0 comments

armisael commented Nov 4, 2014

I'm generating a large RDF file using OpenRefine and the RDF extension, and I'm getting an OutOfMemoryError. Looking at the full stack trace (below), it seems to me that RdfExporter.buildModel is loading the whole graph in memory. I'm not familiar with OpenRDF, so I'm asking: is it possible to change the exporter to work in a streaming fashion? We don't really need to process the data twice, once to build the model and once to generate the triples, do we?

java.lang.OutOfMemoryError: Java heap space
    at org.openrdf.sail.memory.model.MemStatementList.growArray(MemStatementList.java:143)
    at org.openrdf.sail.memory.model.MemStatementList.add(MemStatementList.java:67)
    at org.openrdf.sail.memory.MemoryStore.addStatement(MemoryStore.java:595)
    at org.openrdf.sail.memory.MemoryStoreConnection.addStatementInternal(MemoryStoreConnection.java:418)
    at org.openrdf.sail.memory.MemoryStoreConnection.addStatementInternal(MemoryStoreConnection.java:379)
    at org.openrdf.sail.helpers.SailConnectionBase.addStatement(SailConnectionBase.java:331)
    at org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:236)
    at org.openrdf.repository.base.RepositoryConnectionBase.addWithoutCommit(RepositoryConnectionBase.java:591)
    at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:486)
    at org.deri.grefine.rdf.ResourceNode.addLinks(ResourceNode.java:100)
    at org.deri.grefine.rdf.ResourceNode.createNode(ResourceNode.java:119)
    at org.deri.grefine.rdf.exporters.RdfExporter$1.visit(RdfExporter.java:110)
    at com.google.refine.browsing.util.ConjunctiveFilteredRows.visitRow(ConjunctiveFilteredRows.java:76)
    at com.google.refine.browsing.util.ConjunctiveFilteredRows.accept(ConjunctiveFilteredRows.java:65)
    at org.deri.grefine.rdf.exporters.RdfExporter.buildModel(RdfExporter.java:123)
    at org.deri.grefine.rdf.exporters.RdfExporter.buildModel(RdfExporter.java:115)
    at org.deri.grefine.rdf.exporters.RdfExporter.export(RdfExporter.java:85)
    at com.google.refine.commands.project.ExportRowsCommand.doPost(ExportRowsCommand.java:101)
    at com.google.refine.RefineServlet.service(RefineServlet.java:177)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
    at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
    at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:155)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
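
To make the request concrete, here is a minimal sketch of what I mean by streaming: each row is visited once and its triples are written straight to the output, so memory use does not grow with the number of rows. This is not the extension's code. `TripleSink`, `NTriplesSink`, `export` and the `example.org` URIs are hypothetical names invented for illustration, and it uses only the JDK. In the real exporter the sink would presumably be something like Sesame's Rio `RDFWriter`, which as far as I can tell accepts statements one at a time through `handleStatement`, but I have not checked how that fits with the rest of the extension.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class StreamingExportSketch {

    /** Receives triples one at a time; hypothetical stand-in for a streaming RDF handler. */
    interface TripleSink {
        void triple(String subject, String predicate, String object) throws IOException;
    }

    /** Writes each triple as an N-Triples line immediately; nothing is accumulated. */
    static final class NTriplesSink implements TripleSink {
        private final Writer out;
        private long count = 0;

        NTriplesSink(Writer out) {
            this.out = out;
        }

        @Override
        public void triple(String s, String p, String o) throws IOException {
            out.write("<" + s + "> <" + p + "> \"" + escape(o) + "\" .\n");
            count++;
        }

        long count() {
            return count;
        }

        /** Escapes the characters that must not appear raw in an N-Triples literal. */
        private static String escape(String literal) {
            return literal.replace("\\", "\\\\")
                          .replace("\"", "\\\"")
                          .replace("\n", "\\n")
                          .replace("\r", "\\r");
        }
    }

    /** Single pass over the rows: the triples of a row go to the sink before the next row is read. */
    static void export(Iterable<String[]> rows, TripleSink sink) throws IOException {
        int rowIndex = 0;
        for (String[] row : rows) {
            String subject = "http://example.org/row/" + rowIndex++; // assumed URI scheme
            sink.triple(subject, "http://example.org/name", row[0]);
            sink.triple(subject, "http://example.org/city", row[1]);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = Arrays.asList(
                new String[] {"Alice", "Trento"},
                new String[] {"Bob", "Galway"});
        // A StringWriter keeps the demo self-contained; a real export would wrap the response stream.
        StringWriter buffer = new StringWriter();
        export(rows, new NTriplesSink(buffer));
        System.out.print(buffer);
    }
}
```

The trade-off is that a line-oriented format such as N-Triples streams trivially, while formats that group statements or need all namespaces up front may require more care from the writer.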