Description
Affects: PythonCall
Describe the bug
I have been trying to use pandas from PythonCall.jl and wanted to document a few calls that do not translate directly to Julia. This might just mean we need a PythonPandas package to translate calls, but I wonder if there are any missing methods that could be implemented to fix things automatically.
First, the preamble for this:
using PythonCall
pd = pyimport("pandas")
- 1. Constructing pandas.DataFrame:
Using a similar syntax to Python:
df = pd.DataFrame(Dict([
"a" => [1, 2, 3],
"b" => [4, 5, 6]
]))
which results in the following dataframe:
julia> df
Python:
0
0 b
1 a
i.e., it seems to have a single column named "0" whose rows are the keys "b" and "a"; the values are dropped.
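A possible explanation, stated here as an assumption rather than something confirmed in this issue: the Julia Dict may reach pandas as a wrapper object that behaves like a mapping but is not an actual Python dict, so the constructor falls back to iterating over it, and iterating over a mapping yields only its keys. The pure-Python sketch below imitates that situation without pandas. `WrappedDict` and `describe_input` are hypothetical names made up for illustration; they are not part of PythonCall or pandas.

```python
from collections.abc import Mapping


class WrappedDict(Mapping):
    """Hypothetical stand-in for a wrapped Julia Dict: a Mapping that is not a dict."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def describe_input(data):
    """Simplified, assumed dispatch: only a real dict maps its keys to columns."""
    if isinstance(data, dict):
        return {"columns": list(data.keys())}
    # Anything else is treated as a generic iterable: each item becomes one row.
    return {"rows": list(data)}


wrapped = WrappedDict({"a": [1, 2, 3], "b": [4, 5, 6]})

# The wrapper is iterated, so only the keys survive, one per row.
as_iterable = describe_input(wrapped)

# Converting to a real dict first restores the keys-as-columns behaviour.
as_dict = describe_input(dict(wrapped))
```

If this is indeed the cause, converting the Julia Dict to a real Python dict before passing it to pd.DataFrame should give the expected columns, though that has not been tested here.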
If I instead write this as a vector of pairs, I get:
julia> pd.DataFrame([
"a" => [1, 2, 3],
"b" => [4, 5, 6]
])
Python:
0 1
0 a [1, 2, 3]
1 b [4, 5, 6]
I suppose this one makes sense.
I was able to get it working with the following syntax instead:
julia> df = pd.DataFrame([
1 4
2 5
3 6
], columns=["a", "b"])
Python:
a b
0 1 4
1 2 5
2 3 6
- 2. Selecting multiple columns
So, selecting a single column works:
julia> df["a"]
Python:
0 1
1 2
2 3
Name: a, dtype: int64
but selecting multiple columns does not:
julia> df[["a", "b"]]
ERROR: Python: TypeError: Julia: MethodError: objects of type Vector{String} are not callable
Use square brackets [] for indexing an Array.
Python stacktrace:
[1] __call__
@ ~/.julia/packages/PythonCall/S5MOg/src/JlWrap/any.jl:223
[2] apply_if_callable
@ pandas.core.common ~/Documents/pysr_projects/arya/bigbench/.CondaPkg/env/lib/python3.12/site-packages/pandas/core/common.py:384
[3] __getitem__
@ pandas.core.frame ~/Documents/pysr_projects/arya/bigbench/.CondaPkg/env/lib/python3.12/site-packages/pandas/core/frame.py:4065
Stacktrace:
[1] pythrow()
@ PythonCall.Core ~/.julia/packages/PythonCall/S5MOg/src/Core/err.jl:92
[2] errcheck
@ ~/.julia/packages/PythonCall/S5MOg/src/Core/err.jl:10 [inlined]
[3] pygetitem(x::Py, k::Vector{String})
@ PythonCall.Core ~/.julia/packages/PythonCall/S5MOg/src/Core/builtins.jl:171
[4] getindex(x::Py, i::Vector{String})
@ PythonCall.Core ~/.julia/packages/PythonCall/S5MOg/src/Core/Py.jl:292
[5] top-level scope
@ REPL[18]:1
I got around this by inserting a pylist call:
julia> df[pylist(["a", "b"])]
Python:
a b
0 1 4
1 2 5
2 3 6
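The Python stacktrace above hints at why the plain Julia vector fails while pylist works: `__getitem__` passes the key to `apply_if_callable` in `pandas.core.common`, which ends up in `__call__` on the wrapped vector. The sketch below reproduces that pattern in plain Python, assuming the wrapper exposes `__call__` for every wrapped value and that the helper simply calls anything `callable`. `WrappedVector` and `select` are hypothetical stand-ins, and this `apply_if_callable` is a simplified imitation, not the pandas source.

```python
class WrappedVector:
    """Hypothetical stand-in for a wrapped Julia Vector{String}."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __call__(self, *args, **kwargs):
        # Assumption: the wrapper defines __call__ for all values and
        # forwards the call, which fails when the value is not callable.
        raise TypeError("objects of type Vector{String} are not callable")


def apply_if_callable(maybe_callable, obj):
    """Simplified imitation: call the key if it looks callable, else return it."""
    if callable(maybe_callable):
        return maybe_callable(obj)
    return maybe_callable


def select(frame, key):
    """Toy column selection on a dict of columns."""
    key = apply_if_callable(key, frame)
    return {name: frame[name] for name in key}


frame = {"a": [1, 2, 3], "b": [4, 5, 6]}

# A real Python list is not callable, so it is used directly as the key.
with_list = select(frame, ["a", "b"])

# The wrapper looks callable, so it gets called and raises.
try:
    select(frame, WrappedVector(["a", "b"]))
    wrapper_failed = False
except TypeError:
    wrapper_failed = True

# Converting the wrapper to a list first (the role pylist plays above) works.
with_converted = select(frame, list(WrappedVector(["a", "b"])))
```

Under these assumptions, any wrapped Julia value passed as a key would hit the same path, which may be why converting to a native Python list sidesteps the error.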