Accessing Cloud Data
Also available as:
PDF
loading table of contents...

Unable to Execute HTTP Request: Read Timed Out

A read timeout means that the S3A client could not talk to the S3 service, and eventually gave up trying:

Unable to execute HTTP request: Read timed out
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
    at org.apache.http.impl.client.DefaultRequestDirector.createTunnelToTarget(DefaultRequestDirector.java:902)
    at org.apache.http.impl.client.DefaultRequestDirector.establishRoute(DefaultRequestDirector.java:821)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:647)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1111)
    at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)

This is not uncommon in Hadoop client applications — there is a whole wiki entry dedicated to possible causes of the error.

For S3 connections, key causes are:

  • The S3 endpoint property fs.s3a.endpoint for the target bucket is invalid.

  • There's a proxy setting for the S3 client, and the proxy is not reachable or is on a different port.

  • The caller is on a host with fundamental connectivity problems. If a VM is on EC2, consider releasing it and requesting a new one.