Pg-vector upsert indexing overwrites saved section, vector, documents rows and only maintains a single row · Issue #708 · neuml/txtai · GitHub
Closed
obonyojimmy opened this issue May 5, 2024 · 3 comments

Comments

@obonyojimmy

pgvector indexing keeps overwriting previously saved rows in the sections and vectors tables.

@davidmezzetti
Member

I'll take a look and report back.

@davidmezzetti
Member

davidmezzetti commented May 31, 2024

Sorry for the delay in checking this out. Could you share more on this?

For example, when I run the simple example below, it works as expected:

from txtai import Embeddings

# URL set in code for demo purposes. Use environment variables in production.
url = "postgresql+psycopg2://postgres:pass@localhost/postgres"

# Create embeddings
embeddings = Embeddings(
    content=url,
    backend="pgvector",
    pgvector={
        "url": url
    }
)

embeddings.index(["test"])
embeddings.upsert(["test2"])
embeddings.upsert(["test3"])
print(embeddings.search("test"))

# [
#   {'id': '0', 'text': 'test', 'score': 1.0},
#   {'id': '1', 'text': 'test2', 'score': 0.6994196176528931},
#   {'id': '2', 'text': 'test3', 'score': 0.5753536224365234}
# ]

And in Postgres, all three rows are present:

postgres=# select * from sections;
 indexid | id | text  | tags |           entry            
---------+----+-------+------+----------------------------
       0 | 0  | test  |      | 2024-05-31 08:24:52.931027
       1 | 1  | test2 |      | 2024-05-31 08:24:53.09685
       2 | 2  | test3 |      | 2024-05-31 08:24:53.10798

postgres=# select indexid from vectors;
 indexid 
---------
       0
       1
       2

Are you by chance passing the same ids? In that case, it would overwrite the same rows.
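
As a rough illustration of the overwrite behavior described above, here is a minimal sketch using Python's built-in sqlite3 rather than txtai or Postgres. The two-column table and the `upsert` and `count_rows` helpers are assumptions made up for this example, not txtai's actual schema or API. They only show how a keyed upsert behaves when ids repeat versus when they differ.

```python
import sqlite3


def upsert(conn, rows):
    """Insert (id, text) rows; a row whose id already exists is replaced."""
    conn.executemany(
        "INSERT OR REPLACE INTO sections (id, text) VALUES (?, ?)", rows
    )


def count_rows(conn):
    """Return the number of rows currently stored in the sections table."""
    return conn.execute("SELECT COUNT(*) FROM sections").fetchone()[0]


# In-memory stand-in for the database (hypothetical, simplified schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sections (id TEXT PRIMARY KEY, text TEXT)")

# Same id on every call: each upsert replaces the previous row
upsert(conn, [("doc", "test")])
upsert(conn, [("doc", "test2")])
upsert(conn, [("doc", "test3")])
same_id_count = count_rows(conn)

# Distinct ids: each upsert adds a new row
upsert(conn, [("doc2", "test2")])
upsert(conn, [("doc3", "test3")])
distinct_id_count = count_rows(conn)

print(same_id_count, distinct_id_count)
```

If the symptom is a single surviving row after several upserts, checking whether every call passes the same id is a reasonable first step.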

@davidmezzetti
Member

Closing due to inactivity. Please re-open or open a new issue if there are further issues.
