Skip to content

encoding/csv: skipping of empty rows leads to loss of data in single-column datasets #39119

Open
@kokes

Description

@kokes

What version of Go are you using (go version)?

$ go version
go version go1.14 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/ondrej/Library/Caches/go-build"
GOENV="/Users/ondrej/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/ondrej/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.14/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.14/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/hp/q7nph21s1q76nw1hv1hfxv2m0000gn/T/go-build988616937=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I wrote a CSV with a single column and missing data.

What did you expect to see?

I expected to load the data back, intact.

What did you see instead?

I lost the missing values, encoding/csv skipped them as it skips blank lines. In this case, a blank line actually represents data.


I'm not sure I understand the rationale behind skipping blank lines. Neither in terms of common practice (why would I have blank lines in my CSVs?) nor in terms of standards (the closest we have is RFC 4180 and I couldn't find anything about blank lines - so I'm not sure if Go follows it).

Here's a reproduction of the problem. I wrote a dataset into a file and was unable to read it back.

package main

import (
	"encoding/csv"
	"errors"
	"log"
	"os"
	"reflect"
)

func writeData(filename string, data [][]string) error {
	f, err := os.Create(filename)
	if err != nil {
		return err
	}
	defer f.Close()
	cw := csv.NewWriter(f)
	defer cw.Flush()
	if err := cw.WriteAll(data); err != nil {
		return err
	}
	return nil
}

func readData(filename string) ([][]string, error) {
	f, err := os.Open(filename)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	cr := csv.NewReader(f)
	rows, err := cr.ReadAll()
	if err != nil {
		return nil, err
	}
	return rows, nil
}

func run() error {
	fn := "data/roundtrip.csv"
	data := [][]string{{"john"}, {"jane"}, {""}, {"jack"}}

	if err := writeData(fn, data); err != nil {
		return err
	}

	returned, err := readData(fn)
	if err != nil {
		return err
	}
	if !reflect.DeepEqual(returned, data) {
		log.Println("expected", data, "got", returned)
		return errors.New("not equal")
	}

	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions