TY - GEN
T1 - Anchor-Based Correction of Substitutions in Indexed Sets
AU - Lenz, Andreas
AU - Siegel, Paul H.
AU - Wachter-Zeh, Antonia
AU - Yaakobi, Eitan
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Motivated by DNA-based data storage, we investigate a system where digital information is stored in an unordered set of several vectors over a finite alphabet. Each vector begins with a unique index that represents its position in the whole data set and does not contain data. This paper deals with the design of error-correcting codes for such indexed sets in the presence of substitution errors. We propose a construction that efficiently deals with the challenges that arise when designing codes for unordered sets. Using a novel mechanism, called anchoring, we show that it is possible to combat the ordering loss of sequences with only a small amount of redundancy, which allows the use of standard coding techniques, such as tensor-product codes, to correct errors within the sequences. We finally derive upper and lower bounds on the achievable redundancy of codes within the considered channel model and verify that our construction yields a redundancy that is close to the best achievable. Surprisingly, our results suggest that correcting errors in the indices requires less redundancy than correcting errors in the data part of the vectors.
UR - http://www.scopus.com/inward/record.url?scp=85073144438&partnerID=8YFLogxK
U2 - 10.1109/ISIT.2019.8849523
DO - 10.1109/ISIT.2019.8849523
M3 - Conference contribution
AN - SCOPUS:85073144438
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 757
EP - 761
BT - 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE International Symposium on Information Theory, ISIT 2019
Y2 - 7 July 2019 through 12 July 2019
ER -